[PATCH 0/4] ext4: Check journal inode extents more carefully

* [PATCH 0/4] ext4: Check journal inode extents more carefully
@ 2020-07-15 13:18 Jan Kara
  2020-07-15 13:18 ` [PATCH 1/4] ext4: Handle error of ext4_setup_system_zone() on remount Jan Kara
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Jan Kara @ 2020-07-15 13:18 UTC
  To: Ted Tso; +Cc: linux-ext4, Ritesh Harjani, Wolfgang Frisch, Jan Kara

Hello!

This series makes ext4 properly check the extent tree blocks of the journal
inode. Skipping these checks (a limitation of the current block validity
checks) leads to a crash in ext4_cache_extents() when the extent tree of the
journal inode is suitably corrupted.

								Honza


* [PATCH 1/4] ext4: Handle error of ext4_setup_system_zone() on remount
  2020-07-15 13:18 [PATCH 0/4] ext4: Check journal inode extents more carefully Jan Kara
@ 2020-07-15 13:18 ` Jan Kara
  2020-07-21 10:36   ` Lukas Czerner
  2020-07-15 13:18 ` [PATCH 2/4] ext4: Don't allow overlapping system zones Jan Kara
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2020-07-15 13:18 UTC
  To: Ted Tso; +Cc: linux-ext4, Ritesh Harjani, Wolfgang Frisch, Jan Kara

ext4_setup_system_zone() can fail. Handle the failure in ext4_remount().

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/super.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 330957ed1f05..8e055ec57a2c 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5653,7 +5653,10 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
 		ext4_register_li_request(sb, first_not_zeroed);
 	}
 
-	ext4_setup_system_zone(sb);
+	err = ext4_setup_system_zone(sb);
+	if (err)
+		goto restore_opts;
+
 	if (sbi->s_journal == NULL && !(old_sb_flags & SB_RDONLY)) {
 		err = ext4_commit_super(sb, 1);
 		if (err)
-- 
2.16.4



* [PATCH 2/4] ext4: Don't allow overlapping system zones
  2020-07-15 13:18 [PATCH 0/4] ext4: Check journal inode extents more carefully Jan Kara
  2020-07-15 13:18 ` [PATCH 1/4] ext4: Handle error of ext4_setup_system_zone() on remount Jan Kara
@ 2020-07-15 13:18 ` Jan Kara
  2020-07-21 10:36   ` Lukas Czerner
  2020-07-15 13:18 ` [PATCH 3/4] ext4: Check journal inode extents more carefully Jan Kara
  2020-07-15 13:18 ` [PATCH 4/4] ext4: Fold ext4_data_block_valid_rcu() into the caller Jan Kara
  3 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2020-07-15 13:18 UTC
  To: Ted Tso; +Cc: linux-ext4, Ritesh Harjani, Wolfgang Frisch, Jan Kara

Currently, add_system_zone() just silently merges two added system zones
that overlap. However, the overlap should not happen, and it generally
suggests that some unrelated metadata overlaps, which indicates the fs is
corrupted. We should have caught such problems earlier (e.g. in
ext4_check_descriptors()) but add this check as another line of defense.
In a later patch we also use this for stricter checking of the journal
inode extent tree.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/block_validity.c | 36 +++++++++++++-----------------------
 1 file changed, 13 insertions(+), 23 deletions(-)

diff --git a/fs/ext4/block_validity.c b/fs/ext4/block_validity.c
index 16e9b2fda03a..b394a50ebbe3 100644
--- a/fs/ext4/block_validity.c
+++ b/fs/ext4/block_validity.c
@@ -68,7 +68,7 @@ static int add_system_zone(struct ext4_system_blocks *system_blks,
 			   ext4_fsblk_t start_blk,
 			   unsigned int count)
 {
-	struct ext4_system_zone *new_entry = NULL, *entry;
+	struct ext4_system_zone *new_entry, *entry;
 	struct rb_node **n = &system_blks->root.rb_node, *node;
 	struct rb_node *parent = NULL, *new_node = NULL;
 
@@ -79,30 +79,20 @@ static int add_system_zone(struct ext4_system_blocks *system_blks,
 			n = &(*n)->rb_left;
 		else if (start_blk >= (entry->start_blk + entry->count))
 			n = &(*n)->rb_right;
-		else {
-			if (start_blk + count > (entry->start_blk +
-						 entry->count))
-				entry->count = (start_blk + count -
-						entry->start_blk);
-			new_node = *n;
-			new_entry = rb_entry(new_node, struct ext4_system_zone,
-					     node);
-			break;
-		}
+		else	/* Unexpected overlap of system zones. */
+			return -EFSCORRUPTED;
 	}
 
-	if (!new_entry) {
-		new_entry = kmem_cache_alloc(ext4_system_zone_cachep,
-					     GFP_KERNEL);
-		if (!new_entry)
-			return -ENOMEM;
-		new_entry->start_blk = start_blk;
-		new_entry->count = count;
-		new_node = &new_entry->node;
-
-		rb_link_node(new_node, parent, n);
-		rb_insert_color(new_node, &system_blks->root);
-	}
+	new_entry = kmem_cache_alloc(ext4_system_zone_cachep,
+				     GFP_KERNEL);
+	if (!new_entry)
+		return -ENOMEM;
+	new_entry->start_blk = start_blk;
+	new_entry->count = count;
+	new_node = &new_entry->node;
+
+	rb_link_node(new_node, parent, n);
+	rb_insert_color(new_node, &system_blks->root);
 
 	/* Can we merge to the left? */
 	node = rb_prev(new_node);
-- 
2.16.4
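
The overlap rejection that this patch adds to add_system_zone() can be
illustrated outside the kernel. What follows is only a minimal userspace
sketch under stated assumptions: struct zone and zone_add() are hypothetical
names, a plain unbalanced binary tree stands in for the kernel rbtree,
-EEXIST stands in for -EFSCORRUPTED (which userspace headers do not
define), and the descent compares full ranges, which is a choice of this
sketch rather than a claim about the exact comparison in
add_system_zone().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct ext4_system_zone. */
struct zone {
	struct zone *left, *right;
	uint64_t start;		/* first block of the zone */
	unsigned int count;	/* number of blocks in the zone */
};

/*
 * Insert [start, start + count) into the tree rooted at *root.
 * Returns 0 on success, -EEXIST if the range overlaps an existing
 * zone (the patch returns -EFSCORRUPTED at that point), or -ENOMEM
 * if the allocation fails.
 */
static int zone_add(struct zone **root, uint64_t start, unsigned int count)
{
	struct zone **n = root, *new_zone;

	assert(count > 0);
	while (*n) {
		struct zone *z = *n;

		if (start + count <= z->start)
			n = &z->left;	/* entirely before this zone */
		else if (start >= z->start + z->count)
			n = &z->right;	/* entirely after this zone */
		else			/* unexpected overlap of zones */
			return -EEXIST;
	}

	new_zone = calloc(1, sizeof(*new_zone));
	if (!new_zone)
		return -ENOMEM;
	new_zone->start = start;
	new_zone->count = count;
	*n = new_zone;
	return 0;
}
```

In this sketch, adjacent ranges such as [100, 110) and [110, 115) are both
accepted, while a range that touches any block of an existing zone is
refused instead of being merged into it.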



* [PATCH 3/4] ext4: Check journal inode extents more carefully
  2020-07-15 13:18 [PATCH 0/4] ext4: Check journal inode extents more carefully Jan Kara
  2020-07-15 13:18 ` [PATCH 1/4] ext4: Handle error of ext4_setup_system_zone() on remount Jan Kara
  2020-07-15 13:18 ` [PATCH 2/4] ext4: Don't allow overlapping system zones Jan Kara
@ 2020-07-15 13:18 ` Jan Kara
  2020-07-21 10:38   ` Lukas Czerner
  2020-07-15 13:18 ` [PATCH 4/4] ext4: Fold ext4_data_block_valid_rcu() into the caller Jan Kara
  3 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2020-07-15 13:18 UTC
  To: Ted Tso; +Cc: linux-ext4, Ritesh Harjani, Wolfgang Frisch, Jan Kara

Currently, system zones just track ranges of blocks that are "important"
fs metadata (bitmaps, group descriptors, journal blocks, etc.). This
however complicates how the extent tree (or indirect blocks) can be
checked for inodes that actually track such metadata - currently the
journal inode, but arguably we should be treating quota files or the
resize inode similarly. We cannot run __ext4_ext_check() on such metadata
inodes when loading their extents as that would immediately trigger the
validity checks, and so we just hack around that and special-case the
journal inode. This however leads to a situation where a journal inode
whose extent tree has depth at least one can have an invalid extent tree
that goes unnoticed until ext4_cache_extents() crashes.

To overcome this limitation, track the inode number each system zone
belongs to (0 is used for zones not belonging to any inode). We can then
verify that the inode number matches the expected one when verifying the
extent tree and thus avoid the false errors. With this there's no need to
special-case the journal inode during extent tree checking anymore, so
remove it.

Fixes: 0a944e8a6c66 ("ext4: don't perform block validity checks on the journal inode")
Reported-by: Wolfgang Frisch <wolfgang.frisch@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/block_validity.c | 49 ++++++++++++++++++++++++------------------------
 fs/ext4/ext4.h           |  6 +++---
 fs/ext4/extents.c        | 16 ++++++----------
 fs/ext4/indirect.c       |  6 ++----
 fs/ext4/inode.c          |  5 ++---
 fs/ext4/mballoc.c        |  4 ++--
 6 files changed, 40 insertions(+), 46 deletions(-)

diff --git a/fs/ext4/block_validity.c b/fs/ext4/block_validity.c
index b394a50ebbe3..3602356cbf09 100644
--- a/fs/ext4/block_validity.c
+++ b/fs/ext4/block_validity.c
@@ -24,6 +24,7 @@ struct ext4_system_zone {
 	struct rb_node	node;
 	ext4_fsblk_t	start_blk;
 	unsigned int	count;
+	u32		ino;
 };
 
 static struct kmem_cache *ext4_system_zone_cachep;
@@ -45,7 +46,8 @@ void ext4_exit_system_zone(void)
 static inline int can_merge(struct ext4_system_zone *entry1,
 		     struct ext4_system_zone *entry2)
 {
-	if ((entry1->start_blk + entry1->count) == entry2->start_blk)
+	if ((entry1->start_blk + entry1->count) == entry2->start_blk &&
+	    entry1->ino == entry2->ino)
 		return 1;
 	return 0;
 }
@@ -66,7 +68,7 @@ static void release_system_zone(struct ext4_system_blocks *system_blks)
  */
 static int add_system_zone(struct ext4_system_blocks *system_blks,
 			   ext4_fsblk_t start_blk,
-			   unsigned int count)
+			   unsigned int count, u32 ino)
 {
 	struct ext4_system_zone *new_entry, *entry;
 	struct rb_node **n = &system_blks->root.rb_node, *node;
@@ -89,6 +91,7 @@ static int add_system_zone(struct ext4_system_blocks *system_blks,
 		return -ENOMEM;
 	new_entry->start_blk = start_blk;
 	new_entry->count = count;
+	new_entry->ino = ino;
 	new_node = &new_entry->node;
 
 	rb_link_node(new_node, parent, n);
@@ -149,7 +152,7 @@ static void debug_print_tree(struct ext4_sb_info *sbi)
 static int ext4_data_block_valid_rcu(struct ext4_sb_info *sbi,
 				     struct ext4_system_blocks *system_blks,
 				     ext4_fsblk_t start_blk,
-				     unsigned int count)
+				     unsigned int count, ino_t ino)
 {
 	struct ext4_system_zone *entry;
 	struct rb_node *n;
@@ -170,7 +173,7 @@ static int ext4_data_block_valid_rcu(struct ext4_sb_info *sbi,
 		else if (start_blk >= (entry->start_blk + entry->count))
 			n = n->rb_right;
 		else
-			return 0;
+			return entry->ino == ino;
 	}
 	return 1;
 }
@@ -204,19 +207,18 @@ static int ext4_protect_reserved_inode(struct super_block *sb,
 		if (n == 0) {
 			i++;
 		} else {
-			if (!ext4_data_block_valid_rcu(sbi, system_blks,
-						map.m_pblk, n)) {
-				err = -EFSCORRUPTED;
-				__ext4_error(sb, __func__, __LINE__, -err,
-					     map.m_pblk, "blocks %llu-%llu "
-					     "from inode %u overlap system zone",
-					     map.m_pblk,
-					     map.m_pblk + map.m_len - 1, ino);
+			err = add_system_zone(system_blks, map.m_pblk, n, ino);
+			if (err < 0) {
+				if (err == -EFSCORRUPTED) {
+					__ext4_error(sb, __func__, __LINE__,
+						     -err, map.m_pblk,
+						     "blocks %llu-%llu from inode %u overlap system zone",
+						     map.m_pblk,
+						     map.m_pblk + map.m_len - 1,
+						     ino);
+				}
 				break;
 			}
-			err = add_system_zone(system_blks, map.m_pblk, n);
-			if (err < 0)
-				break;
 			i += n;
 		}
 	}
@@ -270,19 +272,19 @@ int ext4_setup_system_zone(struct super_block *sb)
 		    ((i < 5) || ((i % flex_size) == 0)))
 			add_system_zone(system_blks,
 					ext4_group_first_block_no(sb, i),
-					ext4_bg_num_gdb(sb, i) + 1);
+					ext4_bg_num_gdb(sb, i) + 1, 0);
 		gdp = ext4_get_group_desc(sb, i, NULL);
 		ret = add_system_zone(system_blks,
-				ext4_block_bitmap(sb, gdp), 1);
+				ext4_block_bitmap(sb, gdp), 1, 0);
 		if (ret)
 			goto err;
 		ret = add_system_zone(system_blks,
-				ext4_inode_bitmap(sb, gdp), 1);
+				ext4_inode_bitmap(sb, gdp), 1, 0);
 		if (ret)
 			goto err;
 		ret = add_system_zone(system_blks,
 				ext4_inode_table(sb, gdp),
-				sbi->s_itb_per_group);
+				sbi->s_itb_per_group, 0);
 		if (ret)
 			goto err;
 	}
@@ -331,7 +333,7 @@ void ext4_release_system_zone(struct super_block *sb)
 		call_rcu(&system_blks->rcu, ext4_destroy_system_zone);
 }
 
-int ext4_data_block_valid(struct ext4_sb_info *sbi, ext4_fsblk_t start_blk,
+int ext4_inode_block_valid(struct inode *inode, ext4_fsblk_t start_blk,
 			  unsigned int count)
 {
 	struct ext4_system_blocks *system_blks;
@@ -344,8 +346,8 @@ int ext4_data_block_valid(struct ext4_sb_info *sbi, ext4_fsblk_t start_blk,
 	 */
 	rcu_read_lock();
 	system_blks = rcu_dereference(sbi->system_blks);
-	ret = ext4_data_block_valid_rcu(sbi, system_blks, start_blk,
-					count);
+	ret = ext4_data_block_valid_rcu(EXT4_SB(inode->i_sb), system_blks,
+					start_blk, count, inode->i_ino);
 	rcu_read_unlock();
 	return ret;
 }
@@ -364,8 +366,7 @@ int ext4_check_blockref(const char *function, unsigned int line,
 	while (bref < p+max) {
 		blk = le32_to_cpu(*bref++);
 		if (blk &&
-		    unlikely(!ext4_data_block_valid(EXT4_SB(inode->i_sb),
-						    blk, 1))) {
+		    unlikely(!ext4_inode_block_valid(inode, blk, 1))) {
 			ext4_error_inode(inode, function, line, blk,
 					 "invalid block");
 			return -EFSCORRUPTED;
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 42f5060f3cdf..42815304902b 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3363,9 +3363,9 @@ extern void ext4_release_system_zone(struct super_block *sb);
 extern int ext4_setup_system_zone(struct super_block *sb);
 extern int __init ext4_init_system_zone(void);
 extern void ext4_exit_system_zone(void);
-extern int ext4_data_block_valid(struct ext4_sb_info *sbi,
-				 ext4_fsblk_t start_blk,
-				 unsigned int count);
+extern int ext4_inode_block_valid(struct inode *inode,
+				  ext4_fsblk_t start_blk,
+				  unsigned int count);
 extern int ext4_check_blockref(const char *, unsigned int,
 			       struct inode *, __le32 *, unsigned int);
 
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 221f240eae60..d75054570e44 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -340,7 +340,7 @@ static int ext4_valid_extent(struct inode *inode, struct ext4_extent *ext)
 	 */
 	if (lblock + len <= lblock)
 		return 0;
-	return ext4_data_block_valid(EXT4_SB(inode->i_sb), block, len);
+	return ext4_inode_block_valid(inode, block, len);
 }
 
 static int ext4_valid_extent_idx(struct inode *inode,
@@ -348,7 +348,7 @@ static int ext4_valid_extent_idx(struct inode *inode,
 {
 	ext4_fsblk_t block = ext4_idx_pblock(ext_idx);
 
-	return ext4_data_block_valid(EXT4_SB(inode->i_sb), block, 1);
+	return ext4_inode_block_valid(inode, block, 1);
 }
 
 static int ext4_valid_extent_entries(struct inode *inode,
@@ -507,14 +507,10 @@ __read_extent_tree_block(const char *function, unsigned int line,
 	}
 	if (buffer_verified(bh) && !(flags & EXT4_EX_FORCE_CACHE))
 		return bh;
-	if (!ext4_has_feature_journal(inode->i_sb) ||
-	    (inode->i_ino !=
-	     le32_to_cpu(EXT4_SB(inode->i_sb)->s_es->s_journal_inum))) {
-		err = __ext4_ext_check(function, line, inode,
-				       ext_block_hdr(bh), depth, pblk);
-		if (err)
-			goto errout;
-	}
+	err = __ext4_ext_check(function, line, inode,
+			       ext_block_hdr(bh), depth, pblk);
+	if (err)
+		goto errout;
 	set_buffer_verified(bh);
 	/*
 	 * If this is a leaf block, cache all of its entries
diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
index be2b66eb65f7..402641825712 100644
--- a/fs/ext4/indirect.c
+++ b/fs/ext4/indirect.c
@@ -858,8 +858,7 @@ static int ext4_clear_blocks(handle_t *handle, struct inode *inode,
 	else if (ext4_should_journal_data(inode))
 		flags |= EXT4_FREE_BLOCKS_FORGET;
 
-	if (!ext4_data_block_valid(EXT4_SB(inode->i_sb), block_to_free,
-				   count)) {
+	if (!ext4_inode_block_valid(inode, block_to_free, count)) {
 		EXT4_ERROR_INODE(inode, "attempt to clear invalid "
 				 "blocks %llu len %lu",
 				 (unsigned long long) block_to_free, count);
@@ -1004,8 +1003,7 @@ static void ext4_free_branches(handle_t *handle, struct inode *inode,
 			if (!nr)
 				continue;		/* A hole */
 
-			if (!ext4_data_block_valid(EXT4_SB(inode->i_sb),
-						   nr, 1)) {
+			if (!ext4_inode_block_valid(inode, nr, 1)) {
 				EXT4_ERROR_INODE(inode,
 						 "invalid indirect mapped "
 						 "block %lu (level %d)",
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 10dd470876b3..92573f8540ab 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -394,8 +394,7 @@ static int __check_block_validity(struct inode *inode, const char *func,
 	    (inode->i_ino ==
 	     le32_to_cpu(EXT4_SB(inode->i_sb)->s_es->s_journal_inum)))
 		return 0;
-	if (!ext4_data_block_valid(EXT4_SB(inode->i_sb), map->m_pblk,
-				   map->m_len)) {
+	if (!ext4_inode_block_valid(inode, map->m_pblk, map->m_len)) {
 		ext4_error_inode(inode, func, line, map->m_pblk,
 				 "lblock %lu mapped to illegal pblock %llu "
 				 "(length %d)", (unsigned long) map->m_lblk,
@@ -4760,7 +4759,7 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
 
 	ret = 0;
 	if (ei->i_file_acl &&
-	    !ext4_data_block_valid(EXT4_SB(sb), ei->i_file_acl, 1)) {
+	    !ext4_inode_block_valid(inode, ei->i_file_acl, 1)) {
 		ext4_error_inode(inode, function, line, 0,
 				 "iget: bad extended attribute block %llu",
 				 ei->i_file_acl);
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index c0a331e2feb0..38719c156573 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3090,7 +3090,7 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
 	block = ext4_grp_offs_to_block(sb, &ac->ac_b_ex);
 
 	len = EXT4_C2B(sbi, ac->ac_b_ex.fe_len);
-	if (!ext4_data_block_valid(sbi, block, len)) {
+	if (!ext4_inode_block_valid(ac->ac_inode, block, len)) {
 		ext4_error(sb, "Allocating blocks %llu-%llu which overlap "
 			   "fs metadata", block, block+len);
 		/* File system mounted not to panic on error
@@ -4915,7 +4915,7 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
 
 	sbi = EXT4_SB(sb);
 	if (!(flags & EXT4_FREE_BLOCKS_VALIDATED) &&
-	    !ext4_data_block_valid(sbi, block, count)) {
+	    !ext4_inode_block_valid(inode, block, count)) {
 		ext4_error(sb, "Freeing blocks not in datazone - "
 			   "block = %llu, count = %lu", block, count);
 		goto error_return;
-- 
2.16.4
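
The ownership rule introduced by this patch (a block range that hits a
system zone is acceptable only when the zone belongs to the inode being
checked) can be sketched in userspace as follows. This is an illustration
under assumptions, not the kernel code: struct owned_zone and
block_range_valid() are hypothetical names, and a sorted array of
non-overlapping zones with binary search replaces the rbtree walk. Like
the lookup in the patch, it decides on the first overlapping zone it
finds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct ext4_system_zone with its new field. */
struct owned_zone {
	uint64_t start;		/* first block of the zone */
	unsigned int count;	/* number of blocks in the zone */
	uint32_t ino;		/* owning inode number, 0 = no inode */
};

/*
 * zones[] must be sorted by start and must not overlap.
 * Returns 1 if [start, start + count) hits no zone at all, or if the
 * first zone found to overlap it is owned by inode ino; 0 otherwise.
 */
static int block_range_valid(const struct owned_zone *zones, size_t nr,
			     uint64_t start, unsigned int count,
			     uint32_t ino)
{
	size_t lo = 0, hi = nr;

	assert(count > 0);	/* start + count - 1 below needs count >= 1 */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct owned_zone *z = &zones[mid];

		if (start + count - 1 < z->start)
			hi = mid;		/* range ends before zone */
		else if (start >= z->start + z->count)
			lo = mid + 1;		/* range starts after zone */
		else
			return z->ino == ino;	/* overlap: owner must match */
	}
	return 1;
}
```

With a zone { 200, 50, 8 } in the array (8 being used here as an example
journal inode number), a lookup for blocks 205-207 on behalf of inode 8
succeeds, while the same lookup on behalf of any other inode fails, and
so does a lookup by inode 8 into a zone recorded with ino 0.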



* [PATCH 4/4] ext4: Fold ext4_data_block_valid_rcu() into the caller
  2020-07-15 13:18 [PATCH 0/4] ext4: Check journal inode extents more carefully Jan Kara
                   ` (2 preceding siblings ...)
  2020-07-15 13:18 ` [PATCH 3/4] ext4: Check journal inode extents more carefully Jan Kara
@ 2020-07-15 13:18 ` Jan Kara
  2020-07-21 10:39   ` Lukas Czerner
  3 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2020-07-15 13:18 UTC
  To: Ted Tso; +Cc: linux-ext4, Ritesh Harjani, Wolfgang Frisch, Jan Kara

After the previous patch, ext4_data_block_valid_rcu() has a single
caller. Fold it into that caller.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/block_validity.c | 67 ++++++++++++++++++++++--------------------------
 1 file changed, 30 insertions(+), 37 deletions(-)

diff --git a/fs/ext4/block_validity.c b/fs/ext4/block_validity.c
index 3602356cbf09..9c40214f31f9 100644
--- a/fs/ext4/block_validity.c
+++ b/fs/ext4/block_validity.c
@@ -144,40 +144,6 @@ static void debug_print_tree(struct ext4_sb_info *sbi)
 	printk(KERN_CONT "\n");
 }
 
-/*
- * Returns 1 if the passed-in block region (start_blk,
- * start_blk+count) is valid; 0 if some part of the block region
- * overlaps with filesystem metadata blocks.
- */
-static int ext4_data_block_valid_rcu(struct ext4_sb_info *sbi,
-				     struct ext4_system_blocks *system_blks,
-				     ext4_fsblk_t start_blk,
-				     unsigned int count, ino_t ino)
-{
-	struct ext4_system_zone *entry;
-	struct rb_node *n;
-
-	if ((start_blk <= le32_to_cpu(sbi->s_es->s_first_data_block)) ||
-	    (start_blk + count < start_blk) ||
-	    (start_blk + count > ext4_blocks_count(sbi->s_es)))
-		return 0;
-
-	if (system_blks == NULL)
-		return 1;
-
-	n = system_blks->root.rb_node;
-	while (n) {
-		entry = rb_entry(n, struct ext4_system_zone, node);
-		if (start_blk + count - 1 < entry->start_blk)
-			n = n->rb_left;
-		else if (start_blk >= (entry->start_blk + entry->count))
-			n = n->rb_right;
-		else
-			return entry->ino == ino;
-	}
-	return 1;
-}
-
 static int ext4_protect_reserved_inode(struct super_block *sb,
 				       struct ext4_system_blocks *system_blks,
 				       u32 ino)
@@ -333,11 +299,24 @@ void ext4_release_system_zone(struct super_block *sb)
 		call_rcu(&system_blks->rcu, ext4_destroy_system_zone);
 }
 
+/*
+ * Returns 1 if the passed-in block region (start_blk,
+ * start_blk+count) is valid; 0 if some part of the block region
+ * overlaps with some other filesystem metadata blocks.
+ */
 int ext4_inode_block_valid(struct inode *inode, ext4_fsblk_t start_blk,
 			  unsigned int count)
 {
+	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	struct ext4_system_blocks *system_blks;
-	int ret;
+	struct ext4_system_zone *entry;
+	struct rb_node *n;
+	int ret = 1;
+
+	if ((start_blk <= le32_to_cpu(sbi->s_es->s_first_data_block)) ||
+	    (start_blk + count < start_blk) ||
+	    (start_blk + count > ext4_blocks_count(sbi->s_es)))
+		return 0;
 
 	/*
 	 * Lock the system zone to prevent it being released concurrently
@@ -346,8 +325,22 @@ int ext4_inode_block_valid(struct inode *inode, ext4_fsblk_t start_blk,
 	 */
 	rcu_read_lock();
 	system_blks = rcu_dereference(sbi->system_blks);
-	ret = ext4_data_block_valid_rcu(EXT4_SB(inode->i_sb), system_blks,
-					start_blk, count, inode->i_ino);
+	if (system_blks == NULL)
+		goto out_rcu;
+
+	n = system_blks->root.rb_node;
+	while (n) {
+		entry = rb_entry(n, struct ext4_system_zone, node);
+		if (start_blk + count - 1 < entry->start_blk)
+			n = n->rb_left;
+		else if (start_blk >= (entry->start_blk + entry->count))
+			n = n->rb_right;
+		else {
+			ret = (entry->ino == inode->i_ino);
+			break;
+		}
+	}
+out_rcu:
 	rcu_read_unlock();
 	return ret;
 }
-- 
2.16.4
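
The range check that moves to the top of ext4_inode_block_valid() in this
patch guards against unsigned wraparound of start_blk + count before the
system zone tree is consulted. A minimal userspace sketch of the same
three conditions, assuming a hypothetical helper range_in_bounds() that
takes the superblock values as plain parameters, could look like this:

```c
#include <stdint.h>

/*
 * Returns 1 if [start, start + count) lies strictly after
 * first_data_block and does not extend past blocks_count, 0 otherwise.
 * first_data_block and blocks_count stand in for the superblock
 * fields read in the patch.
 */
static int range_in_bounds(uint64_t start, unsigned int count,
			   uint64_t first_data_block, uint64_t blocks_count)
{
	if (start <= first_data_block)
		return 0;
	if (start + count < start)	/* unsigned wraparound */
		return 0;
	if (start + count > blocks_count)
		return 0;
	return 1;
}
```

The middle test matters because unsigned arithmetic wraps silently: for a
start near the top of the 64-bit range, start + count can come out small
enough to pass the final comparison on its own.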



* Re: [PATCH 1/4] ext4: Handle error of ext4_setup_system_zone() on remount
  2020-07-15 13:18 ` [PATCH 1/4] ext4: Handle error of ext4_setup_system_zone() on remount Jan Kara
@ 2020-07-21 10:36   ` Lukas Czerner
  2020-07-27 11:02     ` Jan Kara
  0 siblings, 1 reply; 12+ messages in thread
From: Lukas Czerner @ 2020-07-21 10:36 UTC
  To: Jan Kara; +Cc: Ted Tso, linux-ext4, Ritesh Harjani, Wolfgang Frisch

On Wed, Jul 15, 2020 at 03:18:09PM +0200, Jan Kara wrote:
> ext4_setup_system_zone() can fail. Handle the failure in ext4_remount().
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/ext4/super.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 330957ed1f05..8e055ec57a2c 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -5653,7 +5653,10 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
>  		ext4_register_li_request(sb, first_not_zeroed);
>  	}
>  
> -	ext4_setup_system_zone(sb);
> +	err = ext4_setup_system_zone(sb);
> +	if (err)
> +		goto restore_opts;
> +

Thanks Jan, this looks good. But while you're at it, ext4_remount() is
missing ext4_release_system_zone(), so if we want to enable block_validity
on remount and the remount fails after ext4_setup_system_zone(), we won't
release it. This *I think* means that we would end up with block_validity
enabled without the user knowing about it?

-Lukas

>  	if (sbi->s_journal == NULL && !(old_sb_flags & SB_RDONLY)) {
>  		err = ext4_commit_super(sb, 1);
>  		if (err)
> -- 
> 2.16.4
> 



* Re: [PATCH 2/4] ext4: Don't allow overlapping system zones
  2020-07-15 13:18 ` [PATCH 2/4] ext4: Don't allow overlapping system zones Jan Kara
@ 2020-07-21 10:36   ` Lukas Czerner
  0 siblings, 0 replies; 12+ messages in thread
From: Lukas Czerner @ 2020-07-21 10:36 UTC
  To: Jan Kara; +Cc: Ted Tso, linux-ext4, Ritesh Harjani, Wolfgang Frisch

On Wed, Jul 15, 2020 at 03:18:10PM +0200, Jan Kara wrote:
> Currently, add_system_zone() just silently merges two added system zones
> that overlap. However, the overlap should not happen, and it generally
> suggests that some unrelated metadata overlaps, which indicates the fs is
> corrupted. We should have caught such problems earlier (e.g. in
> ext4_check_descriptors()) but add this check as another line of defense.
> In a later patch we also use this for stricter checking of the journal
> inode extent tree.

Looks good, thanks!

Reviewed-by: Lukas Czerner <lczerner@redhat.com>


> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/ext4/block_validity.c | 36 +++++++++++++-----------------------
>  1 file changed, 13 insertions(+), 23 deletions(-)
> 
> diff --git a/fs/ext4/block_validity.c b/fs/ext4/block_validity.c
> index 16e9b2fda03a..b394a50ebbe3 100644
> --- a/fs/ext4/block_validity.c
> +++ b/fs/ext4/block_validity.c
> @@ -68,7 +68,7 @@ static int add_system_zone(struct ext4_system_blocks *system_blks,
>  			   ext4_fsblk_t start_blk,
>  			   unsigned int count)
>  {
> -	struct ext4_system_zone *new_entry = NULL, *entry;
> +	struct ext4_system_zone *new_entry, *entry;
>  	struct rb_node **n = &system_blks->root.rb_node, *node;
>  	struct rb_node *parent = NULL, *new_node = NULL;
>  
> @@ -79,30 +79,20 @@ static int add_system_zone(struct ext4_system_blocks *system_blks,
>  			n = &(*n)->rb_left;
>  		else if (start_blk >= (entry->start_blk + entry->count))
>  			n = &(*n)->rb_right;
> -		else {
> -			if (start_blk + count > (entry->start_blk +
> -						 entry->count))
> -				entry->count = (start_blk + count -
> -						entry->start_blk);
> -			new_node = *n;
> -			new_entry = rb_entry(new_node, struct ext4_system_zone,
> -					     node);
> -			break;
> -		}
> +		else	/* Unexpected overlap of system zones. */
> +			return -EFSCORRUPTED;
>  	}
>  
> -	if (!new_entry) {
> -		new_entry = kmem_cache_alloc(ext4_system_zone_cachep,
> -					     GFP_KERNEL);
> -		if (!new_entry)
> -			return -ENOMEM;
> -		new_entry->start_blk = start_blk;
> -		new_entry->count = count;
> -		new_node = &new_entry->node;
> -
> -		rb_link_node(new_node, parent, n);
> -		rb_insert_color(new_node, &system_blks->root);
> -	}
> +	new_entry = kmem_cache_alloc(ext4_system_zone_cachep,
> +				     GFP_KERNEL);
> +	if (!new_entry)
> +		return -ENOMEM;
> +	new_entry->start_blk = start_blk;
> +	new_entry->count = count;
> +	new_node = &new_entry->node;
> +
> +	rb_link_node(new_node, parent, n);
> +	rb_insert_color(new_node, &system_blks->root);
>  
>  	/* Can we merge to the left? */
>  	node = rb_prev(new_node);
> -- 
> 2.16.4
> 



* Re: [PATCH 3/4] ext4: Check journal inode extents more carefully
  2020-07-15 13:18 ` [PATCH 3/4] ext4: Check journal inode extents more carefully Jan Kara
@ 2020-07-21 10:38   ` Lukas Czerner
  2020-07-27 10:59     ` Jan Kara
  0 siblings, 1 reply; 12+ messages in thread
From: Lukas Czerner @ 2020-07-21 10:38 UTC
  To: Jan Kara; +Cc: Ted Tso, linux-ext4, Ritesh Harjani, Wolfgang Frisch

On Wed, Jul 15, 2020 at 03:18:11PM +0200, Jan Kara wrote:
> Currently, system zones just track ranges of blocks that are "important"
> fs metadata (bitmaps, group descriptors, journal blocks, etc.). This
> however complicates how the extent tree (or indirect blocks) can be
> checked for inodes that actually track such metadata - currently the
> journal inode, but arguably we should be treating quota files or the
> resize inode similarly. We cannot run __ext4_ext_check() on such metadata
> inodes when loading their extents as that would immediately trigger the
> validity checks, and so we just hack around that and special-case the
> journal inode. This however leads to a situation where a journal inode
> whose extent tree has depth at least one can have an invalid extent tree
> that goes unnoticed until ext4_cache_extents() crashes.
> 
> To overcome this limitation, track the inode number each system zone
> belongs to (0 is used for zones not belonging to any inode). We can then
> verify that the inode number matches the expected one when verifying the
> extent tree and thus avoid the false errors. With this there's no need to
> special-case the journal inode during extent tree checking anymore, so
> remove it.
> 
> Fixes: 0a944e8a6c66 ("ext4: don't perform block validity checks on the journal inode")
> Reported-by: Wolfgang Frisch <wolfgang.frisch@suse.com>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/ext4/block_validity.c | 49 ++++++++++++++++++++++++------------------------
>  fs/ext4/ext4.h           |  6 +++---
>  fs/ext4/extents.c        | 16 ++++++----------
>  fs/ext4/indirect.c       |  6 ++----
>  fs/ext4/inode.c          |  5 ++---
>  fs/ext4/mballoc.c        |  4 ++--
>  6 files changed, 40 insertions(+), 46 deletions(-)
> 
> diff --git a/fs/ext4/block_validity.c b/fs/ext4/block_validity.c
> index b394a50ebbe3..3602356cbf09 100644
> --- a/fs/ext4/block_validity.c
> +++ b/fs/ext4/block_validity.c
> @@ -24,6 +24,7 @@ struct ext4_system_zone {
>  	struct rb_node	node;
>  	ext4_fsblk_t	start_blk;
>  	unsigned int	count;
> +	u32		ino;
>  };
>  
>  static struct kmem_cache *ext4_system_zone_cachep;
> @@ -45,7 +46,8 @@ void ext4_exit_system_zone(void)
>  static inline int can_merge(struct ext4_system_zone *entry1,
>  		     struct ext4_system_zone *entry2)
>  {
> -	if ((entry1->start_blk + entry1->count) == entry2->start_blk)
> +	if ((entry1->start_blk + entry1->count) == entry2->start_blk &&
> +	    entry1->ino == entry2->ino)
>  		return 1;
>  	return 0;
>  }
> @@ -66,7 +68,7 @@ static void release_system_zone(struct ext4_system_blocks *system_blks)
>   */
>  static int add_system_zone(struct ext4_system_blocks *system_blks,
>  			   ext4_fsblk_t start_blk,
> -			   unsigned int count)
> +			   unsigned int count, u32 ino)
>  {
>  	struct ext4_system_zone *new_entry, *entry;
>  	struct rb_node **n = &system_blks->root.rb_node, *node;
> @@ -89,6 +91,7 @@ static int add_system_zone(struct ext4_system_blocks *system_blks,
>  		return -ENOMEM;
>  	new_entry->start_blk = start_blk;
>  	new_entry->count = count;
> +	new_entry->ino = ino;
>  	new_node = &new_entry->node;
>  
>  	rb_link_node(new_node, parent, n);
> @@ -149,7 +152,7 @@ static void debug_print_tree(struct ext4_sb_info *sbi)
>  static int ext4_data_block_valid_rcu(struct ext4_sb_info *sbi,
>  				     struct ext4_system_blocks *system_blks,
>  				     ext4_fsblk_t start_blk,
> -				     unsigned int count)
> +				     unsigned int count, ino_t ino)
>  {
>  	struct ext4_system_zone *entry;
>  	struct rb_node *n;
> @@ -170,7 +173,7 @@ static int ext4_data_block_valid_rcu(struct ext4_sb_info *sbi,
>  		else if (start_blk >= (entry->start_blk + entry->count))
>  			n = n->rb_right;
>  		else
> -			return 0;
> +			return entry->ino == ino;
>  	}
>  	return 1;
>  }
> @@ -204,19 +207,18 @@ static int ext4_protect_reserved_inode(struct super_block *sb,
>  		if (n == 0) {
>  			i++;
>  		} else {
> -			if (!ext4_data_block_valid_rcu(sbi, system_blks,
> -						map.m_pblk, n)) {
> -				err = -EFSCORRUPTED;
> -				__ext4_error(sb, __func__, __LINE__, -err,
> -					     map.m_pblk, "blocks %llu-%llu "
> -					     "from inode %u overlap system zone",
> -					     map.m_pblk,
> -					     map.m_pblk + map.m_len - 1, ino);
> +			err = add_system_zone(system_blks, map.m_pblk, n, ino);
> +			if (err < 0) {
> +				if (err == -EFSCORRUPTED) {
> +					__ext4_error(sb, __func__, __LINE__,
> +						     -err, map.m_pblk,
> +						     "blocks %llu-%llu from inode %u overlap system zone",
> +						     map.m_pblk,
> +						     map.m_pblk + map.m_len - 1,
> +						     ino);
> +				}
>  				break;
>  			}
> -			err = add_system_zone(system_blks, map.m_pblk, n);
> -			if (err < 0)
> -				break;
>  			i += n;
>  		}
>  	}
> @@ -270,19 +272,19 @@ int ext4_setup_system_zone(struct super_block *sb)
>  		    ((i < 5) || ((i % flex_size) == 0)))
>  			add_system_zone(system_blks,
>  					ext4_group_first_block_no(sb, i),
> -					ext4_bg_num_gdb(sb, i) + 1);
> +					ext4_bg_num_gdb(sb, i) + 1, 0);

Is there a good reason we don't check the return value here? It can still
fail, right?

Other than that the patch looks good to me.

Reviewed-by: Lukas Czerner <lczerner@redhat.com>

-Lukas

>  		gdp = ext4_get_group_desc(sb, i, NULL);
>  		ret = add_system_zone(system_blks,
> -				ext4_block_bitmap(sb, gdp), 1);
> +				ext4_block_bitmap(sb, gdp), 1, 0);
>  		if (ret)
>  			goto err;
>  		ret = add_system_zone(system_blks,
> -				ext4_inode_bitmap(sb, gdp), 1);
> +				ext4_inode_bitmap(sb, gdp), 1, 0);
>  		if (ret)
>  			goto err;
>  		ret = add_system_zone(system_blks,
>  				ext4_inode_table(sb, gdp),
> -				sbi->s_itb_per_group);
> +				sbi->s_itb_per_group, 0);
>  		if (ret)
>  			goto err;
>  	}
> @@ -331,7 +333,7 @@ void ext4_release_system_zone(struct super_block *sb)
>  		call_rcu(&system_blks->rcu, ext4_destroy_system_zone);
>  }
>  
> -int ext4_data_block_valid(struct ext4_sb_info *sbi, ext4_fsblk_t start_blk,
> +int ext4_inode_block_valid(struct inode *inode, ext4_fsblk_t start_blk,
>  			  unsigned int count)
>  {
>  	struct ext4_system_blocks *system_blks;
> @@ -344,8 +346,8 @@ int ext4_data_block_valid(struct ext4_sb_info *sbi, ext4_fsblk_t start_blk,
>  	 */
>  	rcu_read_lock();
>  	system_blks = rcu_dereference(sbi->system_blks);
> -	ret = ext4_data_block_valid_rcu(sbi, system_blks, start_blk,
> -					count);
> +	ret = ext4_data_block_valid_rcu(EXT4_SB(inode->i_sb), system_blks,
> +					start_blk, count, inode->i_ino);
>  	rcu_read_unlock();
>  	return ret;
>  }
> @@ -364,8 +366,7 @@ int ext4_check_blockref(const char *function, unsigned int line,
>  	while (bref < p+max) {
>  		blk = le32_to_cpu(*bref++);
>  		if (blk &&
> -		    unlikely(!ext4_data_block_valid(EXT4_SB(inode->i_sb),
> -						    blk, 1))) {
> +		    unlikely(!ext4_inode_block_valid(inode, blk, 1))) {
>  			ext4_error_inode(inode, function, line, blk,
>  					 "invalid block");
>  			return -EFSCORRUPTED;
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 42f5060f3cdf..42815304902b 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -3363,9 +3363,9 @@ extern void ext4_release_system_zone(struct super_block *sb);
>  extern int ext4_setup_system_zone(struct super_block *sb);
>  extern int __init ext4_init_system_zone(void);
>  extern void ext4_exit_system_zone(void);
> -extern int ext4_data_block_valid(struct ext4_sb_info *sbi,
> -				 ext4_fsblk_t start_blk,
> -				 unsigned int count);
> +extern int ext4_inode_block_valid(struct inode *inode,
> +				  ext4_fsblk_t start_blk,
> +				  unsigned int count);
>  extern int ext4_check_blockref(const char *, unsigned int,
>  			       struct inode *, __le32 *, unsigned int);
>  
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 221f240eae60..d75054570e44 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -340,7 +340,7 @@ static int ext4_valid_extent(struct inode *inode, struct ext4_extent *ext)
>  	 */
>  	if (lblock + len <= lblock)
>  		return 0;
> -	return ext4_data_block_valid(EXT4_SB(inode->i_sb), block, len);
> +	return ext4_inode_block_valid(inode, block, len);
>  }
>  
>  static int ext4_valid_extent_idx(struct inode *inode,
> @@ -348,7 +348,7 @@ static int ext4_valid_extent_idx(struct inode *inode,
>  {
>  	ext4_fsblk_t block = ext4_idx_pblock(ext_idx);
>  
> -	return ext4_data_block_valid(EXT4_SB(inode->i_sb), block, 1);
> +	return ext4_inode_block_valid(inode, block, 1);
>  }
>  
>  static int ext4_valid_extent_entries(struct inode *inode,
> @@ -507,14 +507,10 @@ __read_extent_tree_block(const char *function, unsigned int line,
>  	}
>  	if (buffer_verified(bh) && !(flags & EXT4_EX_FORCE_CACHE))
>  		return bh;
> -	if (!ext4_has_feature_journal(inode->i_sb) ||
> -	    (inode->i_ino !=
> -	     le32_to_cpu(EXT4_SB(inode->i_sb)->s_es->s_journal_inum))) {
> -		err = __ext4_ext_check(function, line, inode,
> -				       ext_block_hdr(bh), depth, pblk);
> -		if (err)
> -			goto errout;
> -	}
> +	err = __ext4_ext_check(function, line, inode,
> +			       ext_block_hdr(bh), depth, pblk);
> +	if (err)
> +		goto errout;
>  	set_buffer_verified(bh);
>  	/*
>  	 * If this is a leaf block, cache all of its entries
> diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
> index be2b66eb65f7..402641825712 100644
> --- a/fs/ext4/indirect.c
> +++ b/fs/ext4/indirect.c
> @@ -858,8 +858,7 @@ static int ext4_clear_blocks(handle_t *handle, struct inode *inode,
>  	else if (ext4_should_journal_data(inode))
>  		flags |= EXT4_FREE_BLOCKS_FORGET;
>  
> -	if (!ext4_data_block_valid(EXT4_SB(inode->i_sb), block_to_free,
> -				   count)) {
> +	if (!ext4_inode_block_valid(inode, block_to_free, count)) {
>  		EXT4_ERROR_INODE(inode, "attempt to clear invalid "
>  				 "blocks %llu len %lu",
>  				 (unsigned long long) block_to_free, count);
> @@ -1004,8 +1003,7 @@ static void ext4_free_branches(handle_t *handle, struct inode *inode,
>  			if (!nr)
>  				continue;		/* A hole */
>  
> -			if (!ext4_data_block_valid(EXT4_SB(inode->i_sb),
> -						   nr, 1)) {
> +			if (!ext4_inode_block_valid(inode, nr, 1)) {
>  				EXT4_ERROR_INODE(inode,
>  						 "invalid indirect mapped "
>  						 "block %lu (level %d)",
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 10dd470876b3..92573f8540ab 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -394,8 +394,7 @@ static int __check_block_validity(struct inode *inode, const char *func,
>  	    (inode->i_ino ==
>  	     le32_to_cpu(EXT4_SB(inode->i_sb)->s_es->s_journal_inum)))
>  		return 0;
> -	if (!ext4_data_block_valid(EXT4_SB(inode->i_sb), map->m_pblk,
> -				   map->m_len)) {
> +	if (!ext4_inode_block_valid(inode, map->m_pblk, map->m_len)) {
>  		ext4_error_inode(inode, func, line, map->m_pblk,
>  				 "lblock %lu mapped to illegal pblock %llu "
>  				 "(length %d)", (unsigned long) map->m_lblk,
> @@ -4760,7 +4759,7 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
>  
>  	ret = 0;
>  	if (ei->i_file_acl &&
> -	    !ext4_data_block_valid(EXT4_SB(sb), ei->i_file_acl, 1)) {
> +	    !ext4_inode_block_valid(inode, ei->i_file_acl, 1)) {
>  		ext4_error_inode(inode, function, line, 0,
>  				 "iget: bad extended attribute block %llu",
>  				 ei->i_file_acl);
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index c0a331e2feb0..38719c156573 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -3090,7 +3090,7 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
>  	block = ext4_grp_offs_to_block(sb, &ac->ac_b_ex);
>  
>  	len = EXT4_C2B(sbi, ac->ac_b_ex.fe_len);
> -	if (!ext4_data_block_valid(sbi, block, len)) {
> +	if (!ext4_inode_block_valid(ac->ac_inode, block, len)) {
>  		ext4_error(sb, "Allocating blocks %llu-%llu which overlap "
>  			   "fs metadata", block, block+len);
>  		/* File system mounted not to panic on error
> @@ -4915,7 +4915,7 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
>  
>  	sbi = EXT4_SB(sb);
>  	if (!(flags & EXT4_FREE_BLOCKS_VALIDATED) &&
> -	    !ext4_data_block_valid(sbi, block, count)) {
> +	    !ext4_inode_block_valid(inode, block, count)) {
>  		ext4_error(sb, "Freeing blocks not in datazone - "
>  			   "block = %llu, count = %lu", block, count);
>  		goto error_return;
> -- 
> 2.16.4
> 


* Re: [PATCH 4/4] ext4: Fold ext4_data_block_valid_rcu() into the caller
  2020-07-15 13:18 ` [PATCH 4/4] ext4: Fold ext4_data_block_valid_rcu() into the caller Jan Kara
@ 2020-07-21 10:39   ` Lukas Czerner
  0 siblings, 0 replies; 12+ messages in thread
From: Lukas Czerner @ 2020-07-21 10:39 UTC (permalink / raw)
  To: Jan Kara; +Cc: Ted Tso, linux-ext4, Ritesh Harjani, Wolfgang Frisch

On Wed, Jul 15, 2020 at 03:18:12PM +0200, Jan Kara wrote:
> After the previous patch, ext4_data_block_valid_rcu() has a single
> caller. Fold it into it.

Looks good, thanks!

Reviewed-by: Lukas Czerner <lczerner@redhat.com>

> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/ext4/block_validity.c | 67 ++++++++++++++++++++++--------------------------
>  1 file changed, 30 insertions(+), 37 deletions(-)
> 
> diff --git a/fs/ext4/block_validity.c b/fs/ext4/block_validity.c
> index 3602356cbf09..9c40214f31f9 100644
> --- a/fs/ext4/block_validity.c
> +++ b/fs/ext4/block_validity.c
> @@ -144,40 +144,6 @@ static void debug_print_tree(struct ext4_sb_info *sbi)
>  	printk(KERN_CONT "\n");
>  }
>  
> -/*
> - * Returns 1 if the passed-in block region (start_blk,
> - * start_blk+count) is valid; 0 if some part of the block region
> - * overlaps with filesystem metadata blocks.
> - */
> -static int ext4_data_block_valid_rcu(struct ext4_sb_info *sbi,
> -				     struct ext4_system_blocks *system_blks,
> -				     ext4_fsblk_t start_blk,
> -				     unsigned int count, ino_t ino)
> -{
> -	struct ext4_system_zone *entry;
> -	struct rb_node *n;
> -
> -	if ((start_blk <= le32_to_cpu(sbi->s_es->s_first_data_block)) ||
> -	    (start_blk + count < start_blk) ||
> -	    (start_blk + count > ext4_blocks_count(sbi->s_es)))
> -		return 0;
> -
> -	if (system_blks == NULL)
> -		return 1;
> -
> -	n = system_blks->root.rb_node;
> -	while (n) {
> -		entry = rb_entry(n, struct ext4_system_zone, node);
> -		if (start_blk + count - 1 < entry->start_blk)
> -			n = n->rb_left;
> -		else if (start_blk >= (entry->start_blk + entry->count))
> -			n = n->rb_right;
> -		else
> -			return entry->ino == ino;
> -	}
> -	return 1;
> -}
> -
>  static int ext4_protect_reserved_inode(struct super_block *sb,
>  				       struct ext4_system_blocks *system_blks,
>  				       u32 ino)
> @@ -333,11 +299,24 @@ void ext4_release_system_zone(struct super_block *sb)
>  		call_rcu(&system_blks->rcu, ext4_destroy_system_zone);
>  }
>  
> +/*
> + * Returns 1 if the passed-in block region (start_blk,
> + * start_blk+count) is valid; 0 if some part of the block region
> + * overlaps with some other filesystem metadata blocks.
> + */
>  int ext4_inode_block_valid(struct inode *inode, ext4_fsblk_t start_blk,
>  			  unsigned int count)
>  {
> +	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
>  	struct ext4_system_blocks *system_blks;
> -	int ret;
> +	struct ext4_system_zone *entry;
> +	struct rb_node *n;
> +	int ret = 1;
> +
> +	if ((start_blk <= le32_to_cpu(sbi->s_es->s_first_data_block)) ||
> +	    (start_blk + count < start_blk) ||
> +	    (start_blk + count > ext4_blocks_count(sbi->s_es)))
> +		return 0;
>  
>  	/*
>  	 * Lock the system zone to prevent it being released concurrently
> @@ -346,8 +325,22 @@ int ext4_inode_block_valid(struct inode *inode, ext4_fsblk_t start_blk,
>  	 */
>  	rcu_read_lock();
>  	system_blks = rcu_dereference(sbi->system_blks);
> -	ret = ext4_data_block_valid_rcu(EXT4_SB(inode->i_sb), system_blks,
> -					start_blk, count, inode->i_ino);
> +	if (system_blks == NULL)
> +		goto out_rcu;
> +
> +	n = system_blks->root.rb_node;
> +	while (n) {
> +		entry = rb_entry(n, struct ext4_system_zone, node);
> +		if (start_blk + count - 1 < entry->start_blk)
> +			n = n->rb_left;
> +		else if (start_blk >= (entry->start_blk + entry->count))
> +			n = n->rb_right;
> +		else {
> +			ret = (entry->ino == inode->i_ino);
> +			break;
> +		}
> +	}
> +out_rcu:
>  	rcu_read_unlock();
>  	return ret;
>  }
> -- 
> 2.16.4
> 
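
For readers following along, here is a rough user-space sketch of the check
that ext4_inode_block_valid() performs after this patch: three range checks,
then an interval lookup that accepts an overlap only when the zone belongs to
the inode being checked. This is not the kernel code. The struct and function
names are hypothetical, and a sorted array with binary search stands in for
the kernel's rb-tree; the comparisons inside the loop follow the diff above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for struct ext4_system_zone. */
struct zone {
	uint64_t start;	/* first block of the zone */
	uint64_t count;	/* number of blocks in the zone */
	uint32_t ino;	/* owning inode, 0 for plain fs metadata */
};

/*
 * Returns 1 if [start, start + count) may be used by inode `ino`,
 * 0 if the range is out of bounds or overlaps a zone owned by
 * something else. `zones` must be sorted by start and non-overlapping
 * (assumption of this sketch; patch 2/4 rejects overlapping zones).
 */
static int block_range_valid(const struct zone *zones, size_t nr,
			     uint64_t first_data_block, uint64_t blocks_count,
			     uint64_t start, uint64_t count, uint32_t ino)
{
	size_t lo = 0, hi = nr;

	assert(zones != NULL || nr == 0);

	/* Same three checks as in the patch: too low, wrapped, too high. */
	if (start <= first_data_block ||
	    start + count < start ||
	    start + count > blocks_count)
		return 0;

	/* Binary search replaces the rb-tree walk; comparisons are the same. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct zone *z = &zones[mid];

		if (start + count - 1 < z->start)
			hi = mid;		/* range lies left of zone */
		else if (start >= z->start + z->count)
			lo = mid + 1;		/* range lies right of zone */
		else
			return z->ino == ino;	/* overlap: owner must match */
	}
	return 1;
}
```

With zones {100, 10, ino 0} and {200, 50, ino 8}, a range inside the second
zone is valid for inode 8 and invalid for any other inode, which is the
behaviour the series needs so that the journal inode's own blocks pass the
check.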


* Re: [PATCH 3/4] ext4: Check journal inode extents more carefully
  2020-07-21 10:38   ` Lukas Czerner
@ 2020-07-27 10:59     ` Jan Kara
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Kara @ 2020-07-27 10:59 UTC (permalink / raw)
  To: Lukas Czerner
  Cc: Jan Kara, Ted Tso, linux-ext4, Ritesh Harjani, Wolfgang Frisch

On Tue 21-07-20 12:38:55, Lukas Czerner wrote:
> > @@ -270,19 +272,19 @@ int ext4_setup_system_zone(struct super_block *sb)
> >  		    ((i < 5) || ((i % flex_size) == 0)))
> >  			add_system_zone(system_blks,
> >  					ext4_group_first_block_no(sb, i),
> > -					ext4_bg_num_gdb(sb, i) + 1);
> > +					ext4_bg_num_gdb(sb, i) + 1, 0);
> 
> Is there a good reason we don't check the return value here? It can still
> fail, right?

Yes, it can. I'll add a patch to the series that fixes this.
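
As an illustration only (the follow-up patch is not part of this thread), the
pattern being asked for can be modeled in user space as below. The names
zone_table, add_zone() and setup_zones() are hypothetical stand-ins, with
add_zone() playing the role of add_system_zone() failing with -ENOMEM; the
point is just that every call's return value is propagated, as the bitmap and
inode table calls in ext4_setup_system_zone() already do.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the system zone tree: a bounded counter. */
struct zone_table {
	size_t used;	/* zones registered so far */
	size_t cap;	/* registrations that succeed before "allocation" fails */
};

/* Models add_system_zone(): returns 0 on success, -ENOMEM when full. */
static int add_zone(struct zone_table *t)
{
	assert(t != NULL);
	if (t->used == t->cap)
		return -ENOMEM;
	t->used++;
	return 0;
}

/* Registers one zone per group and stops at the first failure. */
static int setup_zones(struct zone_table *t, size_t ngroups)
{
	for (size_t i = 0; i < ngroups; i++) {
		int ret = add_zone(t);

		if (ret)
			return ret;	/* do not silently drop the error */
	}
	return 0;
}
```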

> Other than that the patch looks good to me.
> 
> Reviewed-by: Lukas Czerner <lczerner@redhat.com>

Thanks for the review!

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [PATCH 1/4] ext4: Handle error of ext4_setup_system_zone() on remount
  2020-07-21 10:36   ` Lukas Czerner
@ 2020-07-27 11:02     ` Jan Kara
  2020-07-27 11:22       ` Lukas Czerner
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2020-07-27 11:02 UTC (permalink / raw)
  To: Lukas Czerner
  Cc: Jan Kara, Ted Tso, linux-ext4, Ritesh Harjani, Wolfgang Frisch

On Tue 21-07-20 12:36:28, Lukas Czerner wrote:
> On Wed, Jul 15, 2020 at 03:18:09PM +0200, Jan Kara wrote:
> > ext4_setup_system_zone() can fail. Handle the failure in ext4_remount().
> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  fs/ext4/super.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > index 330957ed1f05..8e055ec57a2c 100644
> > --- a/fs/ext4/super.c
> > +++ b/fs/ext4/super.c
> > @@ -5653,7 +5653,10 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
> >  		ext4_register_li_request(sb, first_not_zeroed);
> >  	}
> >  
> > -	ext4_setup_system_zone(sb);
> > +	err = ext4_setup_system_zone(sb);
> > +	if (err)
> > +		goto restore_opts;
> > +
> 
> Thanks Jan, this looks good. But while you're at it, ext4_remount() is
> missing ext4_release_system_zone(), so if we want to enable block_validity
> on remount and the remount fails after ext4_setup_system_zone(), we won't
> release it. This *I think* means that we would end up with block_validity
> enabled without the user knowing about it?

And vice versa, yes. I'll add a patch that fixes this bug to the series, but
it's an independent issue. Can I add your Reviewed-by for this patch?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [PATCH 1/4] ext4: Handle error of ext4_setup_system_zone() on remount
  2020-07-27 11:02     ` Jan Kara
@ 2020-07-27 11:22       ` Lukas Czerner
  0 siblings, 0 replies; 12+ messages in thread
From: Lukas Czerner @ 2020-07-27 11:22 UTC (permalink / raw)
  To: Jan Kara; +Cc: Ted Tso, linux-ext4, Ritesh Harjani, Wolfgang Frisch

On Mon, Jul 27, 2020 at 01:02:21PM +0200, Jan Kara wrote:
> On Tue 21-07-20 12:36:28, Lukas Czerner wrote:
> > On Wed, Jul 15, 2020 at 03:18:09PM +0200, Jan Kara wrote:
> > > ext4_setup_system_zone() can fail. Handle the failure in ext4_remount().
> > > 
> > > Signed-off-by: Jan Kara <jack@suse.cz>
> > > ---
> > >  fs/ext4/super.c | 5 ++++-
> > >  1 file changed, 4 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > > index 330957ed1f05..8e055ec57a2c 100644
> > > --- a/fs/ext4/super.c
> > > +++ b/fs/ext4/super.c
> > > @@ -5653,7 +5653,10 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
> > >  		ext4_register_li_request(sb, first_not_zeroed);
> > >  	}
> > >  
> > > -	ext4_setup_system_zone(sb);
> > > +	err = ext4_setup_system_zone(sb);
> > > +	if (err)
> > > +		goto restore_opts;
> > > +
> > 
> > Thanks Jan, this looks good. But while you're at it, ext4_remount() is
> > missing ext4_release_system_zone(), so if we want to enable block_validity
> > on remount and the remount fails after ext4_setup_system_zone(), we won't
> > release it. This *I think* means that we would end up with block_validity
> > enabled without the user knowing about it?
> 
> And vice versa, yes. I'll add a patch that fixes this bug to the series, but
> it's an independent issue. Can I add your Reviewed-by for this patch?

Yes, of course. You can add

Reviewed-by: Lukas Czerner <lczerner@redhat.com>

Thanks!
-Lukas

> 
> 								Honza
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
> 


Thread overview: 12+ messages
2020-07-15 13:18 [PATCH 0/4] ext4: Check journal inode extents more carefully Jan Kara
2020-07-15 13:18 ` [PATCH 1/4] ext4: Handle error of ext4_setup_system_zone() on remount Jan Kara
2020-07-21 10:36   ` Lukas Czerner
2020-07-27 11:02     ` Jan Kara
2020-07-27 11:22       ` Lukas Czerner
2020-07-15 13:18 ` [PATCH 2/4] ext4: Don't allow overlapping system zones Jan Kara
2020-07-21 10:36   ` Lukas Czerner
2020-07-15 13:18 ` [PATCH 3/4] ext4: Check journal inode extents more carefully Jan Kara
2020-07-21 10:38   ` Lukas Czerner
2020-07-27 10:59     ` Jan Kara
2020-07-15 13:18 ` [PATCH 4/4] ext4: Fold ext4_data_block_valid_rcu() into the caller Jan Kara
2020-07-21 10:39   ` Lukas Czerner
