* [RFC 1/6] ext4: Fixes ext4_mb_mark_bb() with flex_bg with fast_commit
       [not found] <cover.1643642105.git.riteshh@linux.ibm.com>
@ 2022-01-31 15:16 ` Ritesh Harjani
  2022-02-01 11:21   ` Jan Kara
  2022-01-31 15:16 ` [RFC 2/6] ext4: Implement ext4_group_block_valid() as common function Ritesh Harjani
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Ritesh Harjani @ 2022-01-31 15:16 UTC (permalink / raw)
  To: linux-ext4
  Cc: linux-fsdevel, Theodore Ts'o, Jan Kara, Harshad Shirwadkar,
	Ritesh Harjani

With the flex_bg feature (enabled by default), an extent of any given
inode might span blocks from two different block groups.
ext4_mb_mark_bb() reads the buffer_head of the block bitmap only once, for
the starting block group, and fails to read it again when the extent
overflows into the next block group. The loop below then accesses memory
beyond the block group's bitmap buffer_head, resulting in a data abort.

	for (i = 0; i < clen; i++)
		if (!mb_test_bit(blkoff + i, bitmap_bh->b_data) == !state)
			already++;

This patch adds a block group boundary check to ext4_mb_mark_bb() and
re-reads the buffer_head (bitmap_bh) for each block group the extent
touches.
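
As a rough user-space illustration of the boundary arithmetic (not the
kernel code): the sketch below splits (block, len) at the end of the
starting group. BLOCKS_PER_GROUP, struct split and
split_at_group_boundary() are made-up names, and the sketch assumes a
first data block of 0 and a cluster ratio of 1, so blocks and clusters
coincide.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry for the sketch: 32768 blocks per group, no bigalloc. */
#define BLOCKS_PER_GROUP 32768u

/* Hypothetical result type: how the extent divides at the group boundary. */
struct split {
	uint32_t group;     /* group containing the first block */
	uint32_t blkoff;    /* offset of the first block within that group */
	uint32_t this_len;  /* blocks that belong to this group */
	uint32_t overflow;  /* blocks that spill into the following group(s) */
};

static struct split split_at_group_boundary(uint64_t block, uint32_t len)
{
	struct split s;

	assert(len > 0);
	s.group = (uint32_t)(block / BLOCKS_PER_GROUP);
	s.blkoff = (uint32_t)(block % BLOCKS_PER_GROUP);
	s.overflow = 0;
	/* Same test as in the patch: does the extent run past this group? */
	if ((uint64_t)s.blkoff + len > BLOCKS_PER_GROUP)
		s.overflow = (uint32_t)((uint64_t)s.blkoff + len -
					BLOCKS_PER_GROUP);
	s.this_len = len - s.overflow;
	return s;
}
```

Only this_len bits are valid in the starting group's bitmap; the
overflow part has to be handled against the next group's bitmap.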

Without this patch, I could easily hit a data access abort on a Power platform.

<...>
[   74.327662] EXT4-fs error (device loop3): ext4_mb_generate_buddy:1141: group 11, block bitmap and bg descriptor inconsistent: 21248 vs 23294 free clusters
[   74.533214] EXT4-fs (loop3): shut down requested (2)
[   74.536705] Aborting journal on device loop3-8.
[   74.702705] BUG: Unable to handle kernel data access on read at 0xc00000005e980000
[   74.703727] Faulting instruction address: 0xc0000000007bffb8
cpu 0xd: Vector: 300 (Data Access) at [c000000015db7060]
    pc: c0000000007bffb8: ext4_mb_mark_bb+0x198/0x5a0
    lr: c0000000007bfeec: ext4_mb_mark_bb+0xcc/0x5a0
    sp: c000000015db7300
   msr: 800000000280b033
   dar: c00000005e980000
 dsisr: 40000000
  current = 0xc000000027af6880
  paca    = 0xc00000003ffd5200   irqmask: 0x03   irq_happened: 0x01
    pid   = 5167, comm = mount
<...>
enter ? for help
[c000000015db7380] c000000000782708 ext4_ext_clear_bb+0x378/0x410
[c000000015db7400] c000000000813f14 ext4_fc_replay+0x1794/0x2000
[c000000015db7580] c000000000833f7c do_one_pass+0xe9c/0x12a0
[c000000015db7710] c000000000834504 jbd2_journal_recover+0x184/0x2d0
[c000000015db77c0] c000000000841398 jbd2_journal_load+0x188/0x4a0
[c000000015db7880] c000000000804de8 ext4_fill_super+0x2638/0x3e10
[c000000015db7a40] c0000000005f8404 get_tree_bdev+0x2b4/0x350
[c000000015db7ae0] c0000000007ef058 ext4_get_tree+0x28/0x40
[c000000015db7b00] c0000000005f6344 vfs_get_tree+0x44/0x100
[c000000015db7b70] c00000000063c408 path_mount+0xdd8/0xe70
[c000000015db7c40] c00000000063c8f0 sys_mount+0x450/0x550
[c000000015db7d50] c000000000035770 system_call_exception+0x4a0/0x4e0
[c000000015db7e10] c00000000000c74c system_call_common+0xec/0x250
--- Exception: c00 (System Call) at 00007ffff7dbfaa4

Fixes: 8016e29f4362e28 ("ext4: fast commit recovery path")
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/mballoc.c | 30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index c781974df9d0..8d23108cf9d7 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3899,12 +3899,29 @@ void ext4_mb_mark_bb(struct super_block *sb, ext4_fsblk_t block,
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
 	ext4_group_t group;
 	ext4_grpblk_t blkoff;
-	int i, clen, err;
+	int i, err;
 	int already;
+	unsigned int clen, overflow;
 
-	clen = EXT4_B2C(sbi, len);
-
+again:
+	overflow = 0;
 	ext4_get_group_no_and_offset(sb, block, &group, &blkoff);
+
+	/*
+	 * Check to see if we are freeing blocks across a group
+	 * boundary.
+	 * In case of flex_bg, this can happen that (block, len) may span across
+	 * more than one group. In that case we need to get the corresponding
+	 * group metadata to work with. For this we have goto again loop.
+	 */
+	if (EXT4_C2B(sbi, blkoff) + len > EXT4_BLOCKS_PER_GROUP(sb)) {
+		overflow = EXT4_C2B(sbi, blkoff) + len -
+			EXT4_BLOCKS_PER_GROUP(sb);
+		len -= overflow;
+	}
+
+	clen = EXT4_NUM_B2C(sbi, len);
+
 	bitmap_bh = ext4_read_block_bitmap(sb, group);
 	if (IS_ERR(bitmap_bh)) {
 		err = PTR_ERR(bitmap_bh);
@@ -3960,6 +3977,13 @@ void ext4_mb_mark_bb(struct super_block *sb, ext4_fsblk_t block,
 	err = ext4_handle_dirty_metadata(NULL, NULL, gdp_bh);
 	sync_dirty_buffer(gdp_bh);
 
+	if (overflow && !err) {
+		block += len;
+		len = overflow;
+		put_bh(bitmap_bh);
+		goto again;
+	}
+
 out_err:
 	brelse(bitmap_bh);
 }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC 2/6] ext4: Implement ext4_group_block_valid() as common function
       [not found] <cover.1643642105.git.riteshh@linux.ibm.com>
  2022-01-31 15:16 ` [RFC 1/6] ext4: Fixes ext4_mb_mark_bb() with flex_bg with fast_commit Ritesh Harjani
@ 2022-01-31 15:16 ` Ritesh Harjani
  2022-02-01 11:34   ` Jan Kara
  2022-01-31 15:16 ` [RFC 3/6] ext4: Use in_range() for range checking in ext4_fc_replay_check_excluded Ritesh Harjani
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Ritesh Harjani @ 2022-01-31 15:16 UTC (permalink / raw)
  To: linux-ext4
  Cc: linux-fsdevel, Theodore Ts'o, Jan Kara, Harshad Shirwadkar,
	Ritesh Harjani

This patch implements ext4_group_block_valid() as a common check and
converts the open-coded checks in ext4_free_blocks() and
ext4_group_add_blocks() to use it.
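
For illustration only, the check can be modelled in user space as
follows. struct group_meta and group_block_valid() are hypothetical
stand-ins for the group descriptor accessors; the in_range() macro is
written to match the inclusive [first, first + len - 1] semantics this
patch relies on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if b lies in the inclusive range [first, first + len - 1]. */
#define in_range(b, first, len) \
	((b) >= (first) && (b) <= (first) + (len) - 1)

/* Hypothetical stand-in for the fields read from a group descriptor. */
struct group_meta {
	uint64_t block_bitmap;   /* block number of the block bitmap */
	uint64_t inode_bitmap;   /* block number of the inode bitmap */
	uint64_t inode_table;    /* first block of the inode table */
	uint64_t itb_per_group;  /* inode table length in blocks */
};

/* Returns false if [block, block + count - 1] touches group metadata. */
static bool group_block_valid(const struct group_meta *gm,
			      uint64_t block, uint64_t count)
{
	assert(count > 0);
	if (in_range(gm->block_bitmap, block, count) ||
	    in_range(gm->inode_bitmap, block, count) ||
	    in_range(block, gm->inode_table, gm->itb_per_group) ||
	    in_range(block + count - 1, gm->inode_table, gm->itb_per_group))
		return false;
	return true;
}
```

With the bitmaps at blocks 1000 and 1001 and a 512-block inode table
starting at 1002, a range ending at block 1000 is rejected while a range
starting at 1514 is accepted.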

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/block_validity.c | 31 +++++++++++++++++++++++++++++++
 fs/ext4/ext4.h           |  3 +++
 fs/ext4/mballoc.c        | 16 +++-------------
 3 files changed, 37 insertions(+), 13 deletions(-)

diff --git a/fs/ext4/block_validity.c b/fs/ext4/block_validity.c
index 4666b55b736e..01d822c664df 100644
--- a/fs/ext4/block_validity.c
+++ b/fs/ext4/block_validity.c
@@ -361,3 +361,34 @@ int ext4_check_blockref(const char *function, unsigned int line,
 	return 0;
 }
 
+/*
+ * ext4_group_block_valid - This checks if any of FS metadata blocks of a
+ * given group (@bg) lies in the given range [block, block + count - 1]
+ * or not.
+ *
+ * Return -
+ * - false if it does
+ * - else true
+ */
+bool ext4_group_block_valid(struct super_block *sb, ext4_group_t bg,
+			    ext4_fsblk_t block, unsigned int count)
+{
+	struct ext4_group_desc *gdp;
+	bool ret = true;
+
+	gdp = ext4_get_group_desc(sb, bg, NULL);
+	if (!gdp) {
+		ret = false;
+		goto out;
+	}
+
+	if (in_range(ext4_block_bitmap(sb, gdp), block, count) ||
+	    in_range(ext4_inode_bitmap(sb, gdp), block, count) ||
+	    in_range(block, ext4_inode_table(sb, gdp),
+		    EXT4_SB(sb)->s_itb_per_group) ||
+	    in_range(block + count - 1, ext4_inode_table(sb, gdp),
+		    EXT4_SB(sb)->s_itb_per_group))
+		ret = false;
+out:
+	return ret;
+}
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 18cd5b3b4815..fc7aa4b3e415 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3706,6 +3706,9 @@ extern int ext4_inode_block_valid(struct inode *inode,
 				  unsigned int count);
 extern int ext4_check_blockref(const char *, unsigned int,
 			       struct inode *, __le32 *, unsigned int);
+extern bool ext4_group_block_valid(struct super_block *sb, ext4_group_t bg,
+				   ext4_fsblk_t block, unsigned int count);
+
 
 /* extents.c */
 struct ext4_ext_path;
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 8d23108cf9d7..60d32d3d8dc4 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -6001,13 +6001,7 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
 		goto error_return;
 	}
 
-	if (in_range(ext4_block_bitmap(sb, gdp), block, count) ||
-	    in_range(ext4_inode_bitmap(sb, gdp), block, count) ||
-	    in_range(block, ext4_inode_table(sb, gdp),
-		     sbi->s_itb_per_group) ||
-	    in_range(block + count - 1, ext4_inode_table(sb, gdp),
-		     sbi->s_itb_per_group)) {
-
+	if (!ext4_group_block_valid(sb, block_group, block, count)) {
 		ext4_error(sb, "Freeing blocks in system zone - "
 			   "Block = %llu, count = %lu", block, count);
 		/* err = 0. ext4_std_error should be a no op */
@@ -6078,7 +6072,7 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
 						 NULL);
 			if (err && err != -EOPNOTSUPP)
 				ext4_msg(sb, KERN_WARNING, "discard request in"
-					 " group:%d block:%d count:%lu failed"
+					 " group:%u block:%d count:%lu failed"
 					 " with %d", block_group, bit, count,
 					 err);
 		} else
@@ -6194,11 +6188,7 @@ int ext4_group_add_blocks(handle_t *handle, struct super_block *sb,
 		goto error_return;
 	}
 
-	if (in_range(ext4_block_bitmap(sb, desc), block, count) ||
-	    in_range(ext4_inode_bitmap(sb, desc), block, count) ||
-	    in_range(block, ext4_inode_table(sb, desc), sbi->s_itb_per_group) ||
-	    in_range(block + count - 1, ext4_inode_table(sb, desc),
-		     sbi->s_itb_per_group)) {
+	if (!ext4_group_block_valid(sb, block_group, block, count)) {
 		ext4_error(sb, "Adding blocks in system zones - "
 			   "Block = %llu, count = %lu",
 			   block, count);
-- 
2.31.1



* [RFC 3/6] ext4: Use in_range() for range checking in ext4_fc_replay_check_excluded
       [not found] <cover.1643642105.git.riteshh@linux.ibm.com>
  2022-01-31 15:16 ` [RFC 1/6] ext4: Fixes ext4_mb_mark_bb() with flex_bg with fast_commit Ritesh Harjani
  2022-01-31 15:16 ` [RFC 2/6] ext4: Implement ext4_group_block_valid() as common function Ritesh Harjani
@ 2022-01-31 15:16 ` Ritesh Harjani
  2022-02-01 11:35   ` Jan Kara
  2022-01-31 15:16 ` [RFC 4/6] ext4: No need to test for block bitmap bits in ext4_mb_mark_bb() Ritesh Harjani
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Ritesh Harjani @ 2022-01-31 15:16 UTC (permalink / raw)
  To: linux-ext4
  Cc: linux-fsdevel, Theodore Ts'o, Jan Kara, Harshad Shirwadkar,
	Ritesh Harjani

Use in_range() instead of open coding the range check.
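
A small user-space comparison of the two forms, as a sketch: the
function names are made up, and all values are modelled as uint64_t.
The two agree for len > 0; in this unsigned model they can differ for
len == 0 (first + len - 1 wraps when first is 0), which is why the
existing len == 0 skip in the loop matters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if b lies in the inclusive range [first, first + len - 1]. */
#define in_range(b, first, len) \
	((b) >= (first) && (b) <= (first) + (len) - 1)

/* Open-coded form removed by the patch. */
static bool excluded_open_coded(uint64_t blk, uint64_t pblk, uint64_t len)
{
	return blk >= pblk && blk < pblk + len;
}

/* in_range() form added by the patch. */
static bool excluded_in_range(uint64_t blk, uint64_t pblk, uint64_t len)
{
	assert(len > 0);  /* the caller skips regions with len == 0 */
	return in_range(blk, pblk, len);
}
```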

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/fast_commit.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 5934c23e153e..bd6a47d18716 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -1874,8 +1874,8 @@ bool ext4_fc_replay_check_excluded(struct super_block *sb, ext4_fsblk_t blk)
 		if (state->fc_regions[i].ino == 0 ||
 			state->fc_regions[i].len == 0)
 			continue;
-		if (blk >= state->fc_regions[i].pblk &&
-		    blk < state->fc_regions[i].pblk + state->fc_regions[i].len)
+		if (in_range(blk, state->fc_regions[i].pblk,
+					state->fc_regions[i].len))
 			return true;
 	}
 	return false;
-- 
2.31.1



* [RFC 4/6] ext4: No need to test for block bitmap bits in ext4_mb_mark_bb()
       [not found] <cover.1643642105.git.riteshh@linux.ibm.com>
                   ` (2 preceding siblings ...)
  2022-01-31 15:16 ` [RFC 3/6] ext4: Use in_range() for range checking in ext4_fc_replay_check_excluded Ritesh Harjani
@ 2022-01-31 15:16 ` Ritesh Harjani
  2022-02-01 11:38   ` Jan Kara
  2022-01-31 15:16 ` [RFC 5/6] ext4: Refactor ext4_free_blocks() to pull out ext4_mb_clear_bb() Ritesh Harjani
  2022-01-31 15:16 ` [RFC 6/6] ext4: Add extra check in ext4_mb_mark_bb() to prevent against possible corruption Ritesh Harjani
  5 siblings, 1 reply; 18+ messages in thread
From: Ritesh Harjani @ 2022-01-31 15:16 UTC (permalink / raw)
  To: linux-ext4
  Cc: linux-fsdevel, Theodore Ts'o, Jan Kara, Harshad Shirwadkar,
	Ritesh Harjani

We don't need the return value of mb_test_and_clear_bits() in
ext4_mb_mark_bb(), so simply use mb_clear_bits() instead.
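
To illustrate the difference between the two kinds of helper, here is a
byte-array sketch. It is not the mballoc implementation: the names are
invented, and it assumes the test variant reports the first bit in the
range that was already clear (or -1 if none was).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Clear len bits starting at bit 'start'; nothing is reported back. */
static void bitmap_clear_range(unsigned char *bm, size_t start, size_t len)
{
	assert(bm != NULL);
	for (size_t i = start; i < start + len; i++)
		bm[i / CHAR_BIT] &= (unsigned char)~(1u << (i % CHAR_BIT));
}

/*
 * Clear the same range, and also report the first bit that was already
 * clear, or -1 if every bit in the range was set beforehand.
 */
static long bitmap_test_and_clear_range(unsigned char *bm, size_t start,
					size_t len)
{
	long first_clear = -1;

	assert(bm != NULL);
	for (size_t i = start; i < start + len; i++) {
		unsigned char mask = (unsigned char)(1u << (i % CHAR_BIT));

		if (!(bm[i / CHAR_BIT] & mask) && first_clear < 0)
			first_clear = (long)i;
		bm[i / CHAR_BIT] &= (unsigned char)~mask;
	}
	return first_clear;
}
```

Both leave the bitmap in the same state; a caller that ignores the
return value gains nothing from the test variant.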

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/mballoc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 60d32d3d8dc4..2f931575e6c2 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3943,7 +3943,7 @@ void ext4_mb_mark_bb(struct super_block *sb, ext4_fsblk_t block,
 	if (state)
 		ext4_set_bits(bitmap_bh->b_data, blkoff, clen);
 	else
-		mb_test_and_clear_bits(bitmap_bh->b_data, blkoff, clen);
+		mb_clear_bits(bitmap_bh->b_data, blkoff, clen);
 	if (ext4_has_group_desc_csum(sb) &&
 	    (gdp->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT))) {
 		gdp->bg_flags &= cpu_to_le16(~EXT4_BG_BLOCK_UNINIT);
-- 
2.31.1



* [RFC 5/6] ext4: Refactor ext4_free_blocks() to pull out ext4_mb_clear_bb()
       [not found] <cover.1643642105.git.riteshh@linux.ibm.com>
                   ` (3 preceding siblings ...)
  2022-01-31 15:16 ` [RFC 4/6] ext4: No need to test for block bitmap bits in ext4_mb_mark_bb() Ritesh Harjani
@ 2022-01-31 15:16 ` Ritesh Harjani
  2022-02-01 11:40   ` Jan Kara
  2022-01-31 15:16 ` [RFC 6/6] ext4: Add extra check in ext4_mb_mark_bb() to prevent against possible corruption Ritesh Harjani
  5 siblings, 1 reply; 18+ messages in thread
From: Ritesh Harjani @ 2022-01-31 15:16 UTC (permalink / raw)
  To: linux-ext4
  Cc: linux-fsdevel, Theodore Ts'o, Jan Kara, Harshad Shirwadkar,
	Ritesh Harjani

ext4_free_blocks() has become too long and confusing. This patch just
pulls the ext4_mb_clear_bb() logic, which clears the block bitmap and
frees the blocks, out into its own function.

No functional change in this patch.
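
The part that stays in ext4_free_blocks() includes the partial-cluster
adjustment. A user-space sketch of that arithmetic follows, assuming a
power-of-two cluster ratio; align_to_clusters() and the two flag names
are hypothetical stand-ins for the EXT4_FREE_BLOCKS_NOFREE_FIRST_CLUSTER
and EXT4_FREE_BLOCKS_NOFREE_LAST_CLUSTER handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flags for the sketch. */
#define NOFREE_FIRST_CLUSTER 0x1
#define NOFREE_LAST_CLUSTER  0x2

/*
 * Round [*block, *block + *count) to cluster boundaries. Returns false
 * when nothing is left to free. 'ratio' must be a power of two.
 */
static bool align_to_clusters(uint64_t *block, uint64_t *count,
			      uint32_t ratio, int flags)
{
	uint64_t overflow;

	assert(ratio && (ratio & (ratio - 1)) == 0);

	overflow = *block & (ratio - 1);   /* offset into the first cluster */
	if (overflow) {
		if (flags & NOFREE_FIRST_CLUSTER) {
			/* Skip the partial first cluster. */
			overflow = ratio - overflow;
			*block += overflow;
			if (*count > overflow)
				*count -= overflow;
			else
				return false;
		} else {
			/* Extend backwards to the cluster start. */
			*block -= overflow;
			*count += overflow;
		}
	}
	overflow = *count & (ratio - 1);   /* tail past the last full cluster */
	if (overflow) {
		if (flags & NOFREE_LAST_CLUSTER) {
			if (*count > overflow)
				*count -= overflow;
			else
				return false;
		} else {
			*count += ratio - overflow;
		}
	}
	return true;
}
```

For example, with a ratio of 16 and no flags, (block 35, count 10)
becomes (block 32, count 16): the whole cluster is freed.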

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/mballoc.c | 180 ++++++++++++++++++++++++++--------------------
 1 file changed, 102 insertions(+), 78 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 2f931575e6c2..5f20e355d08c 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -5870,7 +5870,8 @@ static void ext4_free_blocks_simple(struct inode *inode, ext4_fsblk_t block,
 }
 
 /**
- * ext4_free_blocks() -- Free given blocks and update quota
+ * ext4_mb_clear_bb() -- helper function for freeing blocks.
+ * 			Used by ext4_free_blocks()
  * @handle:		handle for this transaction
  * @inode:		inode
  * @bh:			optional buffer of the block to be freed
@@ -5878,9 +5879,9 @@ static void ext4_free_blocks_simple(struct inode *inode, ext4_fsblk_t block,
  * @count:		number of blocks to be freed
  * @flags:		flags used by ext4_free_blocks
  */
-void ext4_free_blocks(handle_t *handle, struct inode *inode,
-		      struct buffer_head *bh, ext4_fsblk_t block,
-		      unsigned long count, int flags)
+static void ext4_mb_clear_bb(handle_t *handle, struct inode *inode,
+			       ext4_fsblk_t block, unsigned long count,
+			       int flags)
 {
 	struct buffer_head *bitmap_bh = NULL;
 	struct super_block *sb = inode->i_sb;
@@ -5897,80 +5898,6 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
 
 	sbi = EXT4_SB(sb);
 
-	if (sbi->s_mount_state & EXT4_FC_REPLAY) {
-		ext4_free_blocks_simple(inode, block, count);
-		return;
-	}
-
-	might_sleep();
-	if (bh) {
-		if (block)
-			BUG_ON(block != bh->b_blocknr);
-		else
-			block = bh->b_blocknr;
-	}
-
-	if (!(flags & EXT4_FREE_BLOCKS_VALIDATED) &&
-	    !ext4_inode_block_valid(inode, block, count)) {
-		ext4_error(sb, "Freeing blocks not in datazone - "
-			   "block = %llu, count = %lu", block, count);
-		goto error_return;
-	}
-
-	ext4_debug("freeing block %llu\n", block);
-	trace_ext4_free_blocks(inode, block, count, flags);
-
-	if (bh && (flags & EXT4_FREE_BLOCKS_FORGET)) {
-		BUG_ON(count > 1);
-
-		ext4_forget(handle, flags & EXT4_FREE_BLOCKS_METADATA,
-			    inode, bh, block);
-	}
-
-	/*
-	 * If the extent to be freed does not begin on a cluster
-	 * boundary, we need to deal with partial clusters at the
-	 * beginning and end of the extent.  Normally we will free
-	 * blocks at the beginning or the end unless we are explicitly
-	 * requested to avoid doing so.
-	 */
-	overflow = EXT4_PBLK_COFF(sbi, block);
-	if (overflow) {
-		if (flags & EXT4_FREE_BLOCKS_NOFREE_FIRST_CLUSTER) {
-			overflow = sbi->s_cluster_ratio - overflow;
-			block += overflow;
-			if (count > overflow)
-				count -= overflow;
-			else
-				return;
-		} else {
-			block -= overflow;
-			count += overflow;
-		}
-	}
-	overflow = EXT4_LBLK_COFF(sbi, count);
-	if (overflow) {
-		if (flags & EXT4_FREE_BLOCKS_NOFREE_LAST_CLUSTER) {
-			if (count > overflow)
-				count -= overflow;
-			else
-				return;
-		} else
-			count += sbi->s_cluster_ratio - overflow;
-	}
-
-	if (!bh && (flags & EXT4_FREE_BLOCKS_FORGET)) {
-		int i;
-		int is_metadata = flags & EXT4_FREE_BLOCKS_METADATA;
-
-		for (i = 0; i < count; i++) {
-			cond_resched();
-			if (is_metadata)
-				bh = sb_find_get_block(inode->i_sb, block + i);
-			ext4_forget(handle, is_metadata, inode, bh, block + i);
-		}
-	}
-
 do_more:
 	overflow = 0;
 	ext4_get_group_no_and_offset(sb, block, &block_group, &bit);
@@ -6132,6 +6059,103 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
 	return;
 }
 
+/**
+ * ext4_free_blocks() -- Free given blocks and update quota
+ * @handle:		handle for this transaction
+ * @inode:		inode
+ * @bh:			optional buffer of the block to be freed
+ * @block:		starting physical block to be freed
+ * @count:		number of blocks to be freed
+ * @flags:		flags used by ext4_free_blocks
+ */
+void ext4_free_blocks(handle_t *handle, struct inode *inode,
+		      struct buffer_head *bh, ext4_fsblk_t block,
+		      unsigned long count, int flags)
+{
+	struct super_block *sb = inode->i_sb;
+	unsigned int overflow;
+	struct ext4_sb_info *sbi;
+
+	sbi = EXT4_SB(sb);
+
+	if (sbi->s_mount_state & EXT4_FC_REPLAY) {
+		ext4_free_blocks_simple(inode, block, count);
+		return;
+	}
+
+	might_sleep();
+	if (bh) {
+		if (block)
+			BUG_ON(block != bh->b_blocknr);
+		else
+			block = bh->b_blocknr;
+	}
+
+	if (!(flags & EXT4_FREE_BLOCKS_VALIDATED) &&
+	    !ext4_inode_block_valid(inode, block, count)) {
+		ext4_error(sb, "Freeing blocks not in datazone - "
+			   "block = %llu, count = %lu", block, count);
+		return;
+	}
+
+	ext4_debug("freeing block %llu\n", block);
+	trace_ext4_free_blocks(inode, block, count, flags);
+
+	if (bh && (flags & EXT4_FREE_BLOCKS_FORGET)) {
+		BUG_ON(count > 1);
+
+		ext4_forget(handle, flags & EXT4_FREE_BLOCKS_METADATA,
+			    inode, bh, block);
+	}
+
+	/*
+	 * If the extent to be freed does not begin on a cluster
+	 * boundary, we need to deal with partial clusters at the
+	 * beginning and end of the extent.  Normally we will free
+	 * blocks at the beginning or the end unless we are explicitly
+	 * requested to avoid doing so.
+	 */
+	overflow = EXT4_PBLK_COFF(sbi, block);
+	if (overflow) {
+		if (flags & EXT4_FREE_BLOCKS_NOFREE_FIRST_CLUSTER) {
+			overflow = sbi->s_cluster_ratio - overflow;
+			block += overflow;
+			if (count > overflow)
+				count -= overflow;
+			else
+				return;
+		} else {
+			block -= overflow;
+			count += overflow;
+		}
+	}
+	overflow = EXT4_LBLK_COFF(sbi, count);
+	if (overflow) {
+		if (flags & EXT4_FREE_BLOCKS_NOFREE_LAST_CLUSTER) {
+			if (count > overflow)
+				count -= overflow;
+			else
+				return;
+		} else
+			count += sbi->s_cluster_ratio - overflow;
+	}
+
+	if (!bh && (flags & EXT4_FREE_BLOCKS_FORGET)) {
+		int i;
+		int is_metadata = flags & EXT4_FREE_BLOCKS_METADATA;
+
+		for (i = 0; i < count; i++) {
+			cond_resched();
+			if (is_metadata)
+				bh = sb_find_get_block(inode->i_sb, block + i);
+			ext4_forget(handle, is_metadata, inode, bh, block + i);
+		}
+	}
+
+	ext4_mb_clear_bb(handle, inode, block, count, flags);
+	return;
+}
+
 /**
  * ext4_group_add_blocks() -- Add given blocks to an existing group
  * @handle:			handle to this transaction
-- 
2.31.1



* [RFC 6/6] ext4: Add extra check in ext4_mb_mark_bb() to prevent against possible corruption
       [not found] <cover.1643642105.git.riteshh@linux.ibm.com>
                   ` (4 preceding siblings ...)
  2022-01-31 15:16 ` [RFC 5/6] ext4: Refactor ext4_free_blocks() to pull out ext4_mb_clear_bb() Ritesh Harjani
@ 2022-01-31 15:16 ` Ritesh Harjani
  2022-02-01 11:47   ` Jan Kara
  5 siblings, 1 reply; 18+ messages in thread
From: Ritesh Harjani @ 2022-01-31 15:16 UTC (permalink / raw)
  To: linux-ext4
  Cc: linux-fsdevel, Theodore Ts'o, Jan Kara, Harshad Shirwadkar,
	Ritesh Harjani

This patch adds an extra check to ext4_mb_mark_bb() to make sure we
report an error and bail out if we were to mark/clear any of the
critical FS metadata blocks in the bitmap, to prevent accidental
corruption.

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/mballoc.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 5f20e355d08c..c94888534caa 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3920,6 +3920,13 @@ void ext4_mb_mark_bb(struct super_block *sb, ext4_fsblk_t block,
 		len -= overflow;
 	}
 
+	if (!ext4_group_block_valid(sb, group, block, len)) {
+		ext4_error(sb, "Marking blocks in system zone - "
+			   "Block = %llu, len = %d", block, len);
+		bitmap_bh = NULL;
+		goto out_err;
+	}
+
 	clen = EXT4_NUM_B2C(sbi, len);
 
 	bitmap_bh = ext4_read_block_bitmap(sb, group);
-- 
2.31.1



* Re: [RFC 1/6] ext4: Fixes ext4_mb_mark_bb() with flex_bg with fast_commit
  2022-01-31 15:16 ` [RFC 1/6] ext4: Fixes ext4_mb_mark_bb() with flex_bg with fast_commit Ritesh Harjani
@ 2022-02-01 11:21   ` Jan Kara
  2022-02-04 10:12     ` Ritesh Harjani
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Kara @ 2022-02-01 11:21 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: linux-ext4, linux-fsdevel, Theodore Ts'o, Jan Kara,
	Harshad Shirwadkar

On Mon 31-01-22 20:46:50, Ritesh Harjani wrote:
> In case of flex_bg feature (which is by default enabled), extents for
> any given inode might span across blocks from two different block group.
> ext4_mb_mark_bb() only reads the buffer_head of block bitmap once for the
> starting block group, but it fails to read it again when the extent length
> boundary overflows to another block group. Then in this below loop it
> accesses memory beyond the block group bitmap buffer_head and results
> into a data abort.
> 
> 	for (i = 0; i < clen; i++)
> 		if (!mb_test_bit(blkoff + i, bitmap_bh->b_data) == !state)
> 			already++;
> 
> This patch adds this functionality for checking block group boundary in
> ext4_mb_mark_bb() and update the buffer_head(bitmap_bh) for every different
> block group.
> 
> w/o this patch, I was easily able to hit a data access abort using Power platform.
> 
> <...>
> [   74.327662] EXT4-fs error (device loop3): ext4_mb_generate_buddy:1141: group 11, block bitmap and bg descriptor inconsistent: 21248 vs 23294 free clusters
> [   74.533214] EXT4-fs (loop3): shut down requested (2)
> [   74.536705] Aborting journal on device loop3-8.
> [   74.702705] BUG: Unable to handle kernel data access on read at 0xc00000005e980000
> [   74.703727] Faulting instruction address: 0xc0000000007bffb8
> cpu 0xd: Vector: 300 (Data Access) at [c000000015db7060]
>     pc: c0000000007bffb8: ext4_mb_mark_bb+0x198/0x5a0
>     lr: c0000000007bfeec: ext4_mb_mark_bb+0xcc/0x5a0
>     sp: c000000015db7300
>    msr: 800000000280b033
>    dar: c00000005e980000
>  dsisr: 40000000
>   current = 0xc000000027af6880
>   paca    = 0xc00000003ffd5200   irqmask: 0x03   irq_happened: 0x01
>     pid   = 5167, comm = mount
> <...>
> enter ? for help
> [c000000015db7380] c000000000782708 ext4_ext_clear_bb+0x378/0x410
> [c000000015db7400] c000000000813f14 ext4_fc_replay+0x1794/0x2000
> [c000000015db7580] c000000000833f7c do_one_pass+0xe9c/0x12a0
> [c000000015db7710] c000000000834504 jbd2_journal_recover+0x184/0x2d0
> [c000000015db77c0] c000000000841398 jbd2_journal_load+0x188/0x4a0
> [c000000015db7880] c000000000804de8 ext4_fill_super+0x2638/0x3e10
> [c000000015db7a40] c0000000005f8404 get_tree_bdev+0x2b4/0x350
> [c000000015db7ae0] c0000000007ef058 ext4_get_tree+0x28/0x40
> [c000000015db7b00] c0000000005f6344 vfs_get_tree+0x44/0x100
> [c000000015db7b70] c00000000063c408 path_mount+0xdd8/0xe70
> [c000000015db7c40] c00000000063c8f0 sys_mount+0x450/0x550
> [c000000015db7d50] c000000000035770 system_call_exception+0x4a0/0x4e0
> [c000000015db7e10] c00000000000c74c system_call_common+0xec/0x250
> --- Exception: c00 (System Call) at 00007ffff7dbfaa4
> 
> Fixes: 8016e29f4362e28 ("ext4: fast commit recovery path")
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> ---
>  fs/ext4/mballoc.c | 30 +++++++++++++++++++++++++++---
>  1 file changed, 27 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index c781974df9d0..8d23108cf9d7 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -3899,12 +3899,29 @@ void ext4_mb_mark_bb(struct super_block *sb, ext4_fsblk_t block,
>  	struct ext4_sb_info *sbi = EXT4_SB(sb);
>  	ext4_group_t group;
>  	ext4_grpblk_t blkoff;
> -	int i, clen, err;
> +	int i, err;
>  	int already;
> +	unsigned int clen, overflow;
>  
> -	clen = EXT4_B2C(sbi, len);
> -
> +again:

And maybe structure this as a while loop? Like:

	while (len > 0) {
		...
	}

> +	overflow = 0;
>  	ext4_get_group_no_and_offset(sb, block, &group, &blkoff);
> +
> +	/*
> +	 * Check to see if we are freeing blocks across a group
> +	 * boundary.
> +	 * In case of flex_bg, this can happen that (block, len) may span across
> +	 * more than one group. In that case we need to get the corresponding
> +	 * group metadata to work with. For this we have goto again loop.
> +	 */
> +	if (EXT4_C2B(sbi, blkoff) + len > EXT4_BLOCKS_PER_GROUP(sb)) {
> +		overflow = EXT4_C2B(sbi, blkoff) + len -
> +			EXT4_BLOCKS_PER_GROUP(sb);
> +		len -= overflow;

Why not just:

	thisgrp_len = min_t(int, len,
			EXT4_BLOCKS_PER_GROUP(sb) - EXT4_C2B(sbi, blkoff));
	clen = EXT4_NUM_B2C(sbi, thisgrp_len);

It seems easier to understand to me.
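
Put together, the while loop and the min-based length could look like
the following user-space sketch. This is a toy model and not a proposed
patch: the geometry is made up, each "bitmap" uses one byte per block,
and mark_blocks() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCKS_PER_GROUP 64u   /* tiny, made-up geometry for the sketch */
#define NGROUPS          4u

/* One bitmap per group, one byte per block to keep the sketch simple. */
static unsigned char bitmaps[NGROUPS][BLOCKS_PER_GROUP];

/* Mark (state = 1) or clear (state = 0) len blocks starting at block. */
static int mark_blocks(uint64_t block, uint32_t len, int state)
{
	assert(state == 0 || state == 1);

	while (len > 0) {
		uint32_t group = (uint32_t)(block / BLOCKS_PER_GROUP);
		uint32_t blkoff = (uint32_t)(block % BLOCKS_PER_GROUP);
		uint32_t room = BLOCKS_PER_GROUP - blkoff;
		uint32_t thisgrp_len = len < room ? len : room;

		if (group >= NGROUPS)
			return -1;  /* extent runs past the last group */

		/* Each iteration touches only the bitmap of 'group'. */
		memset(&bitmaps[group][blkoff], state, thisgrp_len);

		block += thisgrp_len;
		len -= thisgrp_len;
	}
	return 0;
}
```

Marking 10 blocks starting at block 60 touches offsets 60..63 of group 0
and offsets 0..5 of group 1, with no access past either bitmap.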

								Honza

> +	}
> +
> +	clen = EXT4_NUM_B2C(sbi, len);
> +
>  	bitmap_bh = ext4_read_block_bitmap(sb, group);
>  	if (IS_ERR(bitmap_bh)) {
>  		err = PTR_ERR(bitmap_bh);
> @@ -3960,6 +3977,13 @@ void ext4_mb_mark_bb(struct super_block *sb, ext4_fsblk_t block,
>  	err = ext4_handle_dirty_metadata(NULL, NULL, gdp_bh);
>  	sync_dirty_buffer(gdp_bh);
>  
> +	if (overflow && !err) {
> +		block += len;
> +		len = overflow;
> +		put_bh(bitmap_bh);
> +		goto again;
> +	}
> +
>  out_err:
>  	brelse(bitmap_bh);
>  }
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [RFC 2/6] ext4: Implement ext4_group_block_valid() as common function
  2022-01-31 15:16 ` [RFC 2/6] ext4: Implement ext4_group_block_valid() as common function Ritesh Harjani
@ 2022-02-01 11:34   ` Jan Kara
  2022-02-04 10:08     ` Ritesh Harjani
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Kara @ 2022-02-01 11:34 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: linux-ext4, linux-fsdevel, Theodore Ts'o, Jan Kara,
	Harshad Shirwadkar

On Mon 31-01-22 20:46:51, Ritesh Harjani wrote:
> This patch implements ext4_group_block_valid() check functionality,
> and refactors all the callers to use this common function instead.
> 
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
...

> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 8d23108cf9d7..60d32d3d8dc4 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -6001,13 +6001,7 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
>  		goto error_return;
>  	}
>  
> -	if (in_range(ext4_block_bitmap(sb, gdp), block, count) ||
> -	    in_range(ext4_inode_bitmap(sb, gdp), block, count) ||
> -	    in_range(block, ext4_inode_table(sb, gdp),
> -		     sbi->s_itb_per_group) ||
> -	    in_range(block + count - 1, ext4_inode_table(sb, gdp),
> -		     sbi->s_itb_per_group)) {
> -
> +	if (!ext4_group_block_valid(sb, block_group, block, count)) {
>  		ext4_error(sb, "Freeing blocks in system zone - "
>  			   "Block = %llu, count = %lu", block, count);
>  		/* err = 0. ext4_std_error should be a no op */

When doing this, why not rather directly use ext4_inode_block_valid() here?

> @@ -6194,11 +6188,7 @@ int ext4_group_add_blocks(handle_t *handle, struct super_block *sb,
>  		goto error_return;
>  	}
>  
> -	if (in_range(ext4_block_bitmap(sb, desc), block, count) ||
> -	    in_range(ext4_inode_bitmap(sb, desc), block, count) ||
> -	    in_range(block, ext4_inode_table(sb, desc), sbi->s_itb_per_group) ||
> -	    in_range(block + count - 1, ext4_inode_table(sb, desc),
> -		     sbi->s_itb_per_group)) {
> +	if (!ext4_group_block_valid(sb, block_group, block, count)) {
>  		ext4_error(sb, "Adding blocks in system zones - "
>  			   "Block = %llu, count = %lu",
>  			   block, count);

And here I'd rather refactor ext4_inode_block_valid() a bit to provide a
more generic helper not requiring an inode and use it here...
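
One possible shape for such an inode-less helper, as a user-space sketch
only: struct zone and blocks_valid() are hypothetical, and the linear
scan over an array of zones is a simplification chosen for brevity, not
how the kernel stores its system zones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical system zone: metadata blocks [start, start + count). */
struct zone {
	uint64_t start;
	uint64_t count;
};

/*
 * Returns true if [start, start + count) lies inside the filesystem and
 * overlaps none of the zones. Takes no inode, so any caller can use it.
 */
static bool blocks_valid(const struct zone *zones, size_t nzones,
			 uint64_t fs_blocks, uint64_t start, uint64_t count)
{
	assert(zones != NULL || nzones == 0);

	if (count == 0 || start >= fs_blocks || count > fs_blocks - start)
		return false;
	for (size_t i = 0; i < nzones; i++) {
		/* Half-open ranges overlap iff each starts before the other ends. */
		if (start < zones[i].start + zones[i].count &&
		    zones[i].start < start + count)
			return false;
	}
	return true;
}
```

Unlike a test of only the two endpoints, the overlap test also rejects a
range that completely contains a zone.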

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [RFC 3/6] ext4: Use in_range() for range checking in ext4_fc_replay_check_excluded
  2022-01-31 15:16 ` [RFC 3/6] ext4: Use in_range() for range checking in ext4_fc_replay_check_excluded Ritesh Harjani
@ 2022-02-01 11:35   ` Jan Kara
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Kara @ 2022-02-01 11:35 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: linux-ext4, linux-fsdevel, Theodore Ts'o, Jan Kara,
	Harshad Shirwadkar

On Mon 31-01-22 20:46:52, Ritesh Harjani wrote:
> Instead of open coding it, use in_range() function instead.
> 
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/fast_commit.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> index 5934c23e153e..bd6a47d18716 100644
> --- a/fs/ext4/fast_commit.c
> +++ b/fs/ext4/fast_commit.c
> @@ -1874,8 +1874,8 @@ bool ext4_fc_replay_check_excluded(struct super_block *sb, ext4_fsblk_t blk)
>  		if (state->fc_regions[i].ino == 0 ||
>  			state->fc_regions[i].len == 0)
>  			continue;
> -		if (blk >= state->fc_regions[i].pblk &&
> -		    blk < state->fc_regions[i].pblk + state->fc_regions[i].len)
> +		if (in_range(blk, state->fc_regions[i].pblk,
> +					state->fc_regions[i].len))
>  			return true;
>  	}
>  	return false;
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [RFC 4/6] ext4: No need to test for block bitmap bits in ext4_mb_mark_bb()
  2022-01-31 15:16 ` [RFC 4/6] ext4: No need to test for block bitmap bits in ext4_mb_mark_bb() Ritesh Harjani
@ 2022-02-01 11:38   ` Jan Kara
  2022-02-04 10:10     ` Ritesh Harjani
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Kara @ 2022-02-01 11:38 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: linux-ext4, linux-fsdevel, Theodore Ts'o, Jan Kara,
	Harshad Shirwadkar

On Mon 31-01-22 20:46:53, Ritesh Harjani wrote:
> We don't need the return value of mb_test_and_clear_bits() in ext4_mb_mark_bb()
> So simply use mb_clear_bits() instead.
> 
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>

Looks good. I'm rather confused by the ext4_set_bits() vs mb_clear_bits()
asymmetry, but that's not directly related to this patch. Just another
cleanup to do. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/mballoc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 60d32d3d8dc4..2f931575e6c2 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -3943,7 +3943,7 @@ void ext4_mb_mark_bb(struct super_block *sb, ext4_fsblk_t block,
>  	if (state)
>  		ext4_set_bits(bitmap_bh->b_data, blkoff, clen);
>  	else
> -		mb_test_and_clear_bits(bitmap_bh->b_data, blkoff, clen);
> +		mb_clear_bits(bitmap_bh->b_data, blkoff, clen);
>  	if (ext4_has_group_desc_csum(sb) &&
>  	    (gdp->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT))) {
>  		gdp->bg_flags &= cpu_to_le16(~EXT4_BG_BLOCK_UNINIT);
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC 5/6] ext4: Refactor ext4_free_blocks() to pull out ext4_mb_clear_bb()
  2022-01-31 15:16 ` [RFC 5/6] ext4: Refactor ext4_free_blocks() to pull out ext4_mb_clear_bb() Ritesh Harjani
@ 2022-02-01 11:40   ` Jan Kara
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Kara @ 2022-02-01 11:40 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: linux-ext4, linux-fsdevel, Theodore Ts'o, Jan Kara,
	Harshad Shirwadkar

On Mon 31-01-22 20:46:54, Ritesh Harjani wrote:
> ext4_free_blocks() function became too long and confusing, this patch
> just pulls out the ext4_mb_clear_bb() function logic from it
> which clears the block bitmap and frees it.
> 
> No functionality change in this patch
> 
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>

Yeah, the function was rather long. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/mballoc.c | 180 ++++++++++++++++++++++++++--------------------
>  1 file changed, 102 insertions(+), 78 deletions(-)
> 
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 2f931575e6c2..5f20e355d08c 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -5870,7 +5870,8 @@ static void ext4_free_blocks_simple(struct inode *inode, ext4_fsblk_t block,
>  }
>  
>  /**
> - * ext4_free_blocks() -- Free given blocks and update quota
> + * ext4_mb_clear_bb() -- helper function for freeing blocks.
> + * 			Used by ext4_free_blocks()
>   * @handle:		handle for this transaction
>   * @inode:		inode
>   * @bh:			optional buffer of the block to be freed
> @@ -5878,9 +5879,9 @@ static void ext4_free_blocks_simple(struct inode *inode, ext4_fsblk_t block,
>   * @count:		number of blocks to be freed
>   * @flags:		flags used by ext4_free_blocks
>   */
> -void ext4_free_blocks(handle_t *handle, struct inode *inode,
> -		      struct buffer_head *bh, ext4_fsblk_t block,
> -		      unsigned long count, int flags)
> +static void ext4_mb_clear_bb(handle_t *handle, struct inode *inode,
> +			       ext4_fsblk_t block, unsigned long count,
> +			       int flags)
>  {
>  	struct buffer_head *bitmap_bh = NULL;
>  	struct super_block *sb = inode->i_sb;
> @@ -5897,80 +5898,6 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
>  
>  	sbi = EXT4_SB(sb);
>  
> -	if (sbi->s_mount_state & EXT4_FC_REPLAY) {
> -		ext4_free_blocks_simple(inode, block, count);
> -		return;
> -	}
> -
> -	might_sleep();
> -	if (bh) {
> -		if (block)
> -			BUG_ON(block != bh->b_blocknr);
> -		else
> -			block = bh->b_blocknr;
> -	}
> -
> -	if (!(flags & EXT4_FREE_BLOCKS_VALIDATED) &&
> -	    !ext4_inode_block_valid(inode, block, count)) {
> -		ext4_error(sb, "Freeing blocks not in datazone - "
> -			   "block = %llu, count = %lu", block, count);
> -		goto error_return;
> -	}
> -
> -	ext4_debug("freeing block %llu\n", block);
> -	trace_ext4_free_blocks(inode, block, count, flags);
> -
> -	if (bh && (flags & EXT4_FREE_BLOCKS_FORGET)) {
> -		BUG_ON(count > 1);
> -
> -		ext4_forget(handle, flags & EXT4_FREE_BLOCKS_METADATA,
> -			    inode, bh, block);
> -	}
> -
> -	/*
> -	 * If the extent to be freed does not begin on a cluster
> -	 * boundary, we need to deal with partial clusters at the
> -	 * beginning and end of the extent.  Normally we will free
> -	 * blocks at the beginning or the end unless we are explicitly
> -	 * requested to avoid doing so.
> -	 */
> -	overflow = EXT4_PBLK_COFF(sbi, block);
> -	if (overflow) {
> -		if (flags & EXT4_FREE_BLOCKS_NOFREE_FIRST_CLUSTER) {
> -			overflow = sbi->s_cluster_ratio - overflow;
> -			block += overflow;
> -			if (count > overflow)
> -				count -= overflow;
> -			else
> -				return;
> -		} else {
> -			block -= overflow;
> -			count += overflow;
> -		}
> -	}
> -	overflow = EXT4_LBLK_COFF(sbi, count);
> -	if (overflow) {
> -		if (flags & EXT4_FREE_BLOCKS_NOFREE_LAST_CLUSTER) {
> -			if (count > overflow)
> -				count -= overflow;
> -			else
> -				return;
> -		} else
> -			count += sbi->s_cluster_ratio - overflow;
> -	}
> -
> -	if (!bh && (flags & EXT4_FREE_BLOCKS_FORGET)) {
> -		int i;
> -		int is_metadata = flags & EXT4_FREE_BLOCKS_METADATA;
> -
> -		for (i = 0; i < count; i++) {
> -			cond_resched();
> -			if (is_metadata)
> -				bh = sb_find_get_block(inode->i_sb, block + i);
> -			ext4_forget(handle, is_metadata, inode, bh, block + i);
> -		}
> -	}
> -
>  do_more:
>  	overflow = 0;
>  	ext4_get_group_no_and_offset(sb, block, &block_group, &bit);
> @@ -6132,6 +6059,103 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
>  	return;
>  }
>  
> +/**
> + * ext4_free_blocks() -- Free given blocks and update quota
> + * @handle:		handle for this transaction
> + * @inode:		inode
> + * @bh:			optional buffer of the block to be freed
> + * @block:		starting physical block to be freed
> + * @count:		number of blocks to be freed
> + * @flags:		flags used by ext4_free_blocks
> + */
> +void ext4_free_blocks(handle_t *handle, struct inode *inode,
> +		      struct buffer_head *bh, ext4_fsblk_t block,
> +		      unsigned long count, int flags)
> +{
> +	struct super_block *sb = inode->i_sb;
> +	unsigned int overflow;
> +	struct ext4_sb_info *sbi;
> +
> +	sbi = EXT4_SB(sb);
> +
> +	if (sbi->s_mount_state & EXT4_FC_REPLAY) {
> +		ext4_free_blocks_simple(inode, block, count);
> +		return;
> +	}
> +
> +	might_sleep();
> +	if (bh) {
> +		if (block)
> +			BUG_ON(block != bh->b_blocknr);
> +		else
> +			block = bh->b_blocknr;
> +	}
> +
> +	if (!(flags & EXT4_FREE_BLOCKS_VALIDATED) &&
> +	    !ext4_inode_block_valid(inode, block, count)) {
> +		ext4_error(sb, "Freeing blocks not in datazone - "
> +			   "block = %llu, count = %lu", block, count);
> +		return;
> +	}
> +
> +	ext4_debug("freeing block %llu\n", block);
> +	trace_ext4_free_blocks(inode, block, count, flags);
> +
> +	if (bh && (flags & EXT4_FREE_BLOCKS_FORGET)) {
> +		BUG_ON(count > 1);
> +
> +		ext4_forget(handle, flags & EXT4_FREE_BLOCKS_METADATA,
> +			    inode, bh, block);
> +	}
> +
> +	/*
> +	 * If the extent to be freed does not begin on a cluster
> +	 * boundary, we need to deal with partial clusters at the
> +	 * beginning and end of the extent.  Normally we will free
> +	 * blocks at the beginning or the end unless we are explicitly
> +	 * requested to avoid doing so.
> +	 */
> +	overflow = EXT4_PBLK_COFF(sbi, block);
> +	if (overflow) {
> +		if (flags & EXT4_FREE_BLOCKS_NOFREE_FIRST_CLUSTER) {
> +			overflow = sbi->s_cluster_ratio - overflow;
> +			block += overflow;
> +			if (count > overflow)
> +				count -= overflow;
> +			else
> +				return;
> +		} else {
> +			block -= overflow;
> +			count += overflow;
> +		}
> +	}
> +	overflow = EXT4_LBLK_COFF(sbi, count);
> +	if (overflow) {
> +		if (flags & EXT4_FREE_BLOCKS_NOFREE_LAST_CLUSTER) {
> +			if (count > overflow)
> +				count -= overflow;
> +			else
> +				return;
> +		} else
> +			count += sbi->s_cluster_ratio - overflow;
> +	}
> +
> +	if (!bh && (flags & EXT4_FREE_BLOCKS_FORGET)) {
> +		int i;
> +		int is_metadata = flags & EXT4_FREE_BLOCKS_METADATA;
> +
> +		for (i = 0; i < count; i++) {
> +			cond_resched();
> +			if (is_metadata)
> +				bh = sb_find_get_block(inode->i_sb, block + i);
> +			ext4_forget(handle, is_metadata, inode, bh, block + i);
> +		}
> +	}
> +
> +	ext4_mb_clear_bb(handle, inode, block, count, flags);
> +	return;
> +}
> +
>  /**
>   * ext4_group_add_blocks() -- Add given blocks to an existing group
>   * @handle:			handle to this transaction
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC 6/6] ext4: Add extra check in ext4_mb_mark_bb() to prevent against possible corruption
  2022-01-31 15:16 ` [RFC 6/6] ext4: Add extra check in ext4_mb_mark_bb() to prevent against possible corruption Ritesh Harjani
@ 2022-02-01 11:47   ` Jan Kara
  2022-02-04 10:11     ` Ritesh Harjani
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Kara @ 2022-02-01 11:47 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: linux-ext4, linux-fsdevel, Theodore Ts'o, Jan Kara,
	Harshad Shirwadkar

On Mon 31-01-22 20:46:55, Ritesh Harjani wrote:
> This patch adds an extra check in ext4_mb_mark_bb() function
> to make sure we mark & report error if we were to mark/clear any
> of the critical FS metadata specific bitmaps (& bail out) to prevent
> any accidental corruption.
> 
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>

Again please rather use ext4_inode_block_valid() here. All the callers of
ext4_mb_mark_bb() have the information available.

								Honza

> ---
>  fs/ext4/mballoc.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 5f20e355d08c..c94888534caa 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -3920,6 +3920,13 @@ void ext4_mb_mark_bb(struct super_block *sb, ext4_fsblk_t block,
>  		len -= overflow;
>  	}
>  
> +	if (!ext4_group_block_valid(sb, group, block, len)) {
> +		ext4_error(sb, "Marking blocks in system zone - "
> +			   "Block = %llu, len = %d", block, len);
> +		bitmap_bh = NULL;
> +		goto out_err;
> +	}
> +
>  	clen = EXT4_NUM_B2C(sbi, len);
>  
>  	bitmap_bh = ext4_read_block_bitmap(sb, group);
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC 2/6] ext4: Implement ext4_group_block_valid() as common function
  2022-02-01 11:34   ` Jan Kara
@ 2022-02-04 10:08     ` Ritesh Harjani
  2022-02-04 11:49       ` Jan Kara
  0 siblings, 1 reply; 18+ messages in thread
From: Ritesh Harjani @ 2022-02-04 10:08 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4, linux-fsdevel, Theodore Ts'o, Harshad Shirwadkar

On 22/02/01 12:34PM, Jan Kara wrote:
> On Mon 31-01-22 20:46:51, Ritesh Harjani wrote:
> > This patch implements ext4_group_block_valid() check functionality,
> > and refactors all the callers to use this common function instead.
> >
> > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> ...
>
> > diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> > index 8d23108cf9d7..60d32d3d8dc4 100644
> > --- a/fs/ext4/mballoc.c
> > +++ b/fs/ext4/mballoc.c
> > @@ -6001,13 +6001,7 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
> >  		goto error_return;
> >  	}
> >
> > -	if (in_range(ext4_block_bitmap(sb, gdp), block, count) ||
> > -	    in_range(ext4_inode_bitmap(sb, gdp), block, count) ||
> > -	    in_range(block, ext4_inode_table(sb, gdp),
> > -		     sbi->s_itb_per_group) ||
> > -	    in_range(block + count - 1, ext4_inode_table(sb, gdp),
> > -		     sbi->s_itb_per_group)) {
> > -
> > +	if (!ext4_group_block_valid(sb, block_group, block, count)) {
> >  		ext4_error(sb, "Freeing blocks in system zone - "
> >  			   "Block = %llu, count = %lu", block, count);
> >  		/* err = 0. ext4_std_error should be a no op */
>
> When doing this, why not rather directly use ext4_inode_block_valid() here?

This is because, while freeing these blocks, we already have their
corresponding block group. So there is little point in checking the FS metadata
of all block groups vs. the FS metadata of just this block group, no?

Also, I am not sure whether changing this to check against the system zone's
blocks (which include FS metadata blocks from all block groups) would add any
additional penalty?

-riteshh

>
> > @@ -6194,11 +6188,7 @@ int ext4_group_add_blocks(handle_t *handle, struct super_block *sb,
> >  		goto error_return;
> >  	}
> >
> > -	if (in_range(ext4_block_bitmap(sb, desc), block, count) ||
> > -	    in_range(ext4_inode_bitmap(sb, desc), block, count) ||
> > -	    in_range(block, ext4_inode_table(sb, desc), sbi->s_itb_per_group) ||
> > -	    in_range(block + count - 1, ext4_inode_table(sb, desc),
> > -		     sbi->s_itb_per_group)) {
> > +	if (!ext4_group_block_valid(sb, block_group, block, count)) {
> >  		ext4_error(sb, "Adding blocks in system zones - "
> >  			   "Block = %llu, count = %lu",
> >  			   block, count);
>
> And here I'd rather refactor ext4_inode_block_valid() a bit to provide a
> more generic helper not requiring an inode and use it here...
>
> 								Honza
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC 4/6] ext4: No need to test for block bitmap bits in ext4_mb_mark_bb()
  2022-02-01 11:38   ` Jan Kara
@ 2022-02-04 10:10     ` Ritesh Harjani
  0 siblings, 0 replies; 18+ messages in thread
From: Ritesh Harjani @ 2022-02-04 10:10 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4, linux-fsdevel, Theodore Ts'o, Harshad Shirwadkar

On 22/02/01 12:38PM, Jan Kara wrote:
> On Mon 31-01-22 20:46:53, Ritesh Harjani wrote:
> > We don't need the return value of mb_test_and_clear_bits() in ext4_mb_mark_bb(),
> > so simply use mb_clear_bits() instead.
> >
> > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
>
> Looks good. I'm rather confused by ext4_set_bits() vs mb_clear_bits()
> asymmetry but that's not directly related to this patch. Just another
> cleanup to do. Feel free to add:

Yes, makes sense. Looking at ext4_set_bits(), I think it should be renamed to
mb_set_bits() for uniform API conventions.

>
> Reviewed-by: Jan Kara <jack@suse.cz>
>

Thanks :)

> 								Honza
>
> > ---
> >  fs/ext4/mballoc.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> > index 60d32d3d8dc4..2f931575e6c2 100644
> > --- a/fs/ext4/mballoc.c
> > +++ b/fs/ext4/mballoc.c
> > @@ -3943,7 +3943,7 @@ void ext4_mb_mark_bb(struct super_block *sb, ext4_fsblk_t block,
> >  	if (state)
> >  		ext4_set_bits(bitmap_bh->b_data, blkoff, clen);
> >  	else
> > -		mb_test_and_clear_bits(bitmap_bh->b_data, blkoff, clen);
> > +		mb_clear_bits(bitmap_bh->b_data, blkoff, clen);
> >  	if (ext4_has_group_desc_csum(sb) &&
> >  	    (gdp->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT))) {
> >  		gdp->bg_flags &= cpu_to_le16(~EXT4_BG_BLOCK_UNINIT);
> > --
> > 2.31.1
> >
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC 6/6] ext4: Add extra check in ext4_mb_mark_bb() to prevent against possible corruption
  2022-02-01 11:47   ` Jan Kara
@ 2022-02-04 10:11     ` Ritesh Harjani
  0 siblings, 0 replies; 18+ messages in thread
From: Ritesh Harjani @ 2022-02-04 10:11 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4, linux-fsdevel, Theodore Ts'o, Harshad Shirwadkar

On 22/02/01 12:47PM, Jan Kara wrote:
> On Mon 31-01-22 20:46:55, Ritesh Harjani wrote:
> > This patch adds an extra check in ext4_mb_mark_bb() function
> > to make sure we mark & report error if we were to mark/clear any
> > of the critical FS metadata specific bitmaps (& bail out) to prevent
> > any accidental corruption.
> >
> > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
>
> Again please rather use ext4_inode_block_valid() here. All the callers of
> ext4_mb_mark_bb() have the information available.
>

Same reason here too: since we are already aware of the block group these blocks
belong to, does it make any sense to check against the system zone in that
case?

-ritesh


> 								Honza
>
> > ---
> >  fs/ext4/mballoc.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> > index 5f20e355d08c..c94888534caa 100644
> > --- a/fs/ext4/mballoc.c
> > +++ b/fs/ext4/mballoc.c
> > @@ -3920,6 +3920,13 @@ void ext4_mb_mark_bb(struct super_block *sb, ext4_fsblk_t block,
> >  		len -= overflow;
> >  	}
> >
> > +	if (!ext4_group_block_valid(sb, group, block, len)) {
> > +		ext4_error(sb, "Marking blocks in system zone - "
> > +			   "Block = %llu, len = %d", block, len);
> > +		bitmap_bh = NULL;
> > +		goto out_err;
> > +	}
> > +
> >  	clen = EXT4_NUM_B2C(sbi, len);
> >
> >  	bitmap_bh = ext4_read_block_bitmap(sb, group);
> > --
> > 2.31.1
> >
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC 1/6] ext4: Fixes ext4_mb_mark_bb() with flex_bg with fast_commit
  2022-02-01 11:21   ` Jan Kara
@ 2022-02-04 10:12     ` Ritesh Harjani
  0 siblings, 0 replies; 18+ messages in thread
From: Ritesh Harjani @ 2022-02-04 10:12 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4, linux-fsdevel, Theodore Ts'o, Harshad Shirwadkar

On 22/02/01 12:21PM, Jan Kara wrote:
> On Mon 31-01-22 20:46:50, Ritesh Harjani wrote:
> > In case of flex_bg feature (which is by default enabled), extents for
> > any given inode might span across blocks from two different block group.
> > ext4_mb_mark_bb() only reads the buffer_head of block bitmap once for the
> > starting block group, but it fails to read it again when the extent length
> > boundary overflows to another block group. Then in this below loop it
> > accesses memory beyond the block group bitmap buffer_head and results
> > into a data abort.
> >
> > 	for (i = 0; i < clen; i++)
> > 		if (!mb_test_bit(blkoff + i, bitmap_bh->b_data) == !state)
> > 			already++;
> >
> > This patch adds this functionality for checking block group boundary in
> > ext4_mb_mark_bb() and update the buffer_head(bitmap_bh) for every different
> > block group.
> >
> > w/o this patch, I was easily able to hit a data access abort using Power platform.
> >
> > <...>
> > [   74.327662] EXT4-fs error (device loop3): ext4_mb_generate_buddy:1141: group 11, block bitmap and bg descriptor inconsistent: 21248 vs 23294 free clusters
> > [   74.533214] EXT4-fs (loop3): shut down requested (2)
> > [   74.536705] Aborting journal on device loop3-8.
> > [   74.702705] BUG: Unable to handle kernel data access on read at 0xc00000005e980000
> > [   74.703727] Faulting instruction address: 0xc0000000007bffb8
> > cpu 0xd: Vector: 300 (Data Access) at [c000000015db7060]
> >     pc: c0000000007bffb8: ext4_mb_mark_bb+0x198/0x5a0
> >     lr: c0000000007bfeec: ext4_mb_mark_bb+0xcc/0x5a0
> >     sp: c000000015db7300
> >    msr: 800000000280b033
> >    dar: c00000005e980000
> >  dsisr: 40000000
> >   current = 0xc000000027af6880
> >   paca    = 0xc00000003ffd5200   irqmask: 0x03   irq_happened: 0x01
> >     pid   = 5167, comm = mount
> > <...>
> > enter ? for help
> > [c000000015db7380] c000000000782708 ext4_ext_clear_bb+0x378/0x410
> > [c000000015db7400] c000000000813f14 ext4_fc_replay+0x1794/0x2000
> > [c000000015db7580] c000000000833f7c do_one_pass+0xe9c/0x12a0
> > [c000000015db7710] c000000000834504 jbd2_journal_recover+0x184/0x2d0
> > [c000000015db77c0] c000000000841398 jbd2_journal_load+0x188/0x4a0
> > [c000000015db7880] c000000000804de8 ext4_fill_super+0x2638/0x3e10
> > [c000000015db7a40] c0000000005f8404 get_tree_bdev+0x2b4/0x350
> > [c000000015db7ae0] c0000000007ef058 ext4_get_tree+0x28/0x40
> > [c000000015db7b00] c0000000005f6344 vfs_get_tree+0x44/0x100
> > [c000000015db7b70] c00000000063c408 path_mount+0xdd8/0xe70
> > [c000000015db7c40] c00000000063c8f0 sys_mount+0x450/0x550
> > [c000000015db7d50] c000000000035770 system_call_exception+0x4a0/0x4e0
> > [c000000015db7e10] c00000000000c74c system_call_common+0xec/0x250
> > --- Exception: c00 (System Call) at 00007ffff7dbfaa4
> >
> > Fixes: 8016e29f4362e28 ("ext4: fast commit recovery path")
> > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> > ---
> >  fs/ext4/mballoc.c | 30 +++++++++++++++++++++++++++---
> >  1 file changed, 27 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> > index c781974df9d0..8d23108cf9d7 100644
> > --- a/fs/ext4/mballoc.c
> > +++ b/fs/ext4/mballoc.c
> > @@ -3899,12 +3899,29 @@ void ext4_mb_mark_bb(struct super_block *sb, ext4_fsblk_t block,
> >  	struct ext4_sb_info *sbi = EXT4_SB(sb);
> >  	ext4_group_t group;
> >  	ext4_grpblk_t blkoff;
> > -	int i, clen, err;
> > +	int i, err;
> >  	int already;
> > +	unsigned int clen, overflow;
> >
> > -	clen = EXT4_B2C(sbi, len);
> > -
> > +again:
>
> And maybe structure this as a while loop? Like:
>
> 	while (len > 0) {
> 		...
> 	}

Sure, will check.

>
> > +	overflow = 0;
> >  	ext4_get_group_no_and_offset(sb, block, &group, &blkoff);
> > +
> > +	/*
> > +	 * Check to see if we are freeing blocks across a group
> > +	 * boundary.
> > +	 * In case of flex_bg, this can happen that (block, len) may span across
> > +	 * more than one group. In that case we need to get the corresponding
> > +	 * group metadata to work with. For this we have goto again loop.
> > +	 */
> > +	if (EXT4_C2B(sbi, blkoff) + len > EXT4_BLOCKS_PER_GROUP(sb)) {
> > +		overflow = EXT4_C2B(sbi, blkoff) + len -
> > +			EXT4_BLOCKS_PER_GROUP(sb);
> > +		len -= overflow;
>
> Why not just:
>
> 	thisgrp_len = min_t(int, len,
> 			EXT4_BLOCKS_PER_GROUP(sb) - EXT4_C2B(sbi, blkoff));
> 	clen = EXT4_NUM_B2C(sbi, thisgrp_len);
>
> It seems easier to understand to me.

Agree, will make this change.

-ritesh
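
For reference, the two suggestions above (a while loop plus a per-group length
computed with a min) can be sketched as a stand-alone helper. This is a
simplified model under stated assumptions, not the eventual patch: struct chunk
and split_by_group() are hypothetical names, block numbers are taken relative
to the first data block, and the cluster conversions (EXT4_C2B /
EXT4_NUM_B2C) are left out, i.e. a cluster ratio of 1 is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of the part of a range that falls in one group. */
struct chunk {
	uint64_t group;		/* block group number */
	uint64_t offset;	/* first block, relative to the group start */
	uint64_t len;		/* number of blocks inside this group */
};

/*
 * Split [block, block + len) at block group boundaries.
 * Fills at most max_chunks entries of out[] and returns the number of
 * chunks the range needs.
 */
static unsigned int split_by_group(uint64_t block, uint64_t len,
				   uint64_t blocks_per_group,
				   struct chunk *out, unsigned int max_chunks)
{
	unsigned int n = 0;

	assert(blocks_per_group > 0);

	while (len > 0) {
		uint64_t offset = block % blocks_per_group;
		uint64_t room = blocks_per_group - offset;
		/* Same idea as the suggested min_t(): clamp to this group. */
		uint64_t thisgrp_len = len < room ? len : room;

		if (n < max_chunks) {
			out[n].group = block / blocks_per_group;
			out[n].offset = offset;
			out[n].len = thisgrp_len;
		}
		n++;

		block += thisgrp_len;
		len -= thisgrp_len;
	}
	return n;
}
```

Each chunk then maps to exactly one block bitmap buffer, which is the property
the original loop was missing when an extent crossed a group boundary.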

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC 2/6] ext4: Implement ext4_group_block_valid() as common function
  2022-02-04 10:08     ` Ritesh Harjani
@ 2022-02-04 11:49       ` Jan Kara
  2022-02-05 10:43         ` Ritesh Harjani
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Kara @ 2022-02-04 11:49 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: Jan Kara, linux-ext4, linux-fsdevel, Theodore Ts'o,
	Harshad Shirwadkar

On Fri 04-02-22 15:38:44, Ritesh Harjani wrote:
> On 22/02/01 12:34PM, Jan Kara wrote:
> > On Mon 31-01-22 20:46:51, Ritesh Harjani wrote:
> > > This patch implements ext4_group_block_valid() check functionality,
> > > and refactors all the callers to use this common function instead.
> > >
> > > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> > ...
> >
> > > diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> > > index 8d23108cf9d7..60d32d3d8dc4 100644
> > > --- a/fs/ext4/mballoc.c
> > > +++ b/fs/ext4/mballoc.c
> > > @@ -6001,13 +6001,7 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
> > >  		goto error_return;
> > >  	}
> > >
> > > -	if (in_range(ext4_block_bitmap(sb, gdp), block, count) ||
> > > -	    in_range(ext4_inode_bitmap(sb, gdp), block, count) ||
> > > -	    in_range(block, ext4_inode_table(sb, gdp),
> > > -		     sbi->s_itb_per_group) ||
> > > -	    in_range(block + count - 1, ext4_inode_table(sb, gdp),
> > > -		     sbi->s_itb_per_group)) {
> > > -
> > > +	if (!ext4_group_block_valid(sb, block_group, block, count)) {
> > >  		ext4_error(sb, "Freeing blocks in system zone - "
> > >  			   "Block = %llu, count = %lu", block, count);
> > >  		/* err = 0. ext4_std_error should be a no op */
> >
> > When doing this, why not rather directly use ext4_inode_block_valid() here?
> 
> This is because, while freeing these blocks, we already have their
> corresponding block group. So there is little point in checking the FS metadata
> of all block groups vs. the FS metadata of just this block group, no?
> 
> Also, I am not sure whether changing this to check against the system zone's
> blocks (which include FS metadata blocks from all block groups) would add any
> additional penalty?

I agree the check will be somewhat more costly (rbtree lookup). OTOH with
a more complex fs structure (like flex_bg, which has been the default for quite
some time), this check is far from covering all the metadata blocks that can
overlap the freed range. Also, it does not check for freeing journal
blocks. So I'd either go for no check (if we really want performance) or
a full check (if we care more about detecting fs errors early), because these
half-baked checks do not bring much value these days...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC 2/6] ext4: Implement ext4_group_block_valid() as common function
  2022-02-04 11:49       ` Jan Kara
@ 2022-02-05 10:43         ` Ritesh Harjani
  0 siblings, 0 replies; 18+ messages in thread
From: Ritesh Harjani @ 2022-02-05 10:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4, linux-fsdevel, Theodore Ts'o, Harshad Shirwadkar

On 22/02/04 12:49PM, Jan Kara wrote:
> On Fri 04-02-22 15:38:44, Ritesh Harjani wrote:
> > On 22/02/01 12:34PM, Jan Kara wrote:
> > > On Mon 31-01-22 20:46:51, Ritesh Harjani wrote:
> > > > This patch implements ext4_group_block_valid() check functionality,
> > > > and refactors all the callers to use this common function instead.
> > > >
> > > > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> > > ...
> > >
> > > > diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> > > > index 8d23108cf9d7..60d32d3d8dc4 100644
> > > > --- a/fs/ext4/mballoc.c
> > > > +++ b/fs/ext4/mballoc.c
> > > > @@ -6001,13 +6001,7 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
> > > >  		goto error_return;
> > > >  	}
> > > >
> > > > -	if (in_range(ext4_block_bitmap(sb, gdp), block, count) ||
> > > > -	    in_range(ext4_inode_bitmap(sb, gdp), block, count) ||
> > > > -	    in_range(block, ext4_inode_table(sb, gdp),
> > > > -		     sbi->s_itb_per_group) ||
> > > > -	    in_range(block + count - 1, ext4_inode_table(sb, gdp),
> > > > -		     sbi->s_itb_per_group)) {
> > > > -
> > > > +	if (!ext4_group_block_valid(sb, block_group, block, count)) {
> > > >  		ext4_error(sb, "Freeing blocks in system zone - "
> > > >  			   "Block = %llu, count = %lu", block, count);
> > > >  		/* err = 0. ext4_std_error should be a no op */
> > >
> > > When doing this, why not rather directly use ext4_inode_block_valid() here?
> >
> > This is because, while freeing these blocks, we already have their
> > corresponding block group. So there is little point in checking the FS metadata
> > of all block groups vs. the FS metadata of just this block group, no?
> >
> > Also, I am not sure whether changing this to check against the system zone's
> > blocks (which include FS metadata blocks from all block groups) would add any
> > additional penalty?
>
> I agree the check will be somewhat more costly (rbtree lookup). OTOH with
> a more complex fs structure (like flex_bg, which has been the default for quite
> some time), this check is far from covering all the metadata blocks that can
> overlap the freed range. Also, it does not check for freeing journal
> blocks. So I'd either go for no check (if we really want performance) or
> a full check (if we care more about detecting fs errors early), because these
> half-baked checks do not bring much value these days...

Agreed. Thanks for putting out your points.
I am making these suggested changes to add stricter checking via
ext4_inode_block_valid() and will be sending out v1 soon.

-ritesh
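
To make the trade-off discussed above concrete, here is a small user-space
sketch of why a per-group check can miss metadata under flex_bg. It is an
illustration only: struct extent, overlaps() and range_is_free_of() are
hypothetical names, the block numbers are made up, and a linear scan stands in
for the rbtree lookup mentioned in the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one run of metadata blocks. */
struct extent {
	uint64_t start;
	uint64_t len;
};

/* True if [start, start + len) and the extent share at least one block. */
static bool overlaps(uint64_t start, uint64_t len, const struct extent *e)
{
	return len > 0 && e->len > 0 &&
	       start < e->start + e->len && e->start < start + len;
}

/* True if the range touches none of the n metadata extents in meta[]. */
static bool range_is_free_of(uint64_t start, uint64_t len,
			     const struct extent *meta, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (overlaps(start, len, &meta[i]))
			return false;
	return true;
}
```

Suppose group 0 owns metadata at blocks 10, 11 and 12..15, and flex_bg has also
placed another group's bitmap at block 20 inside group 0. Checking a free of
block 20 against only group 0's own metadata passes, while checking against the
metadata of all groups rejects it.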

^ permalink raw reply	[flat|nested] 18+ messages in thread
